ABSTRACT


Methods are presented (1) to partition or decompose a visual
scene into the bodies forming it; (2) to position these bodies in
three-dimensional space, by combining two scenes that make a
stereoscopic pair; (3) to find the regions or zones of a visual
scene that belong to its background; (4) to carry out the isolation
of objects in (1) when the input has inaccuracies. Running computer
programs implement the methods, and many examples illustrate their 
behavior. The input is a two-dimensional line-drawing of the scene, 
assumed to contain three-dimensional bodies possessing flat faces 
(polyhedra); some of them may be partially occluded. Suggestions 
are made for extending the work to curved objects. Some comparisons 
are made with human visual perception. 

The main conclusion is that it is possible to separate a picture
or scene into the constituent objects exclusively on the basis of
monocular geometric properties (on the basis of pure form); in fact,
successful methods are shown.
If the machine is asked to separate the bodies, it must say

(BODIES ARE AS FOLLOWS:
   (1 8 9) (2 7) (3 5 6) (10 15) (4 13 14))

If asked to report the triangular prisms, it should answer

(10 15 IS A TRIANGULAR PRISM)


This thesis discusses the problems involved in this task.

What should be done when the information is noisy, some lines
are missing, etc.?

How can the computer separate the background from the objects
forming the scene?

How should shadows be handled?
How can stereoscopic vision be used?
What about ambiguities and optical illusions?

This thesis also discusses some related aspects of human
visual perception.


Key words and phrases related to this study are as follows:

artificial intelligence
background
background discrimination
classification of images
CONVERT
cybernetics
feature recognition
geometric objects
geometric processing
graphic processing
graphical communication
graphical data
heuristic programming
image

Computer Review (A. C. M.) index numbers: C.R. 3.61, 3.63,
4.22, 5.20.


Why this work was chosen as a thesis topic

The present work was
carried out using the facilities of the Artificial Intelligence Group
of Project MAC, at M. I. T. Currently, the main goal of the
Artificial Intelligence Group (AI group) is «to extend the way
computers can interact with the real world: specifically to develop
better sensory and motor equipment, and programs to control them.»
{Minsky, Status Report II}. From such efforts, a robot or mechanical
manipulator has been constructed, consisting of a PDP-6 computer,
an image dissector camera, and a mechanical arm and hand (see pictures).


IMAGE DISSECTOR CAMERA

«These "eyes and hands" are eventually to be able to do reasonably
intelligent things but first, of course, it is difficult enough to
get them to do things that are easy for people to do.» {Ibid.}

An image dissector silently watches a triangular prism
in the vision laboratory of the A.I. Group.
The work was naturally divided into visual information processing
(computer vision) and manipulation and control of the arm-hand.
Thus, when I came as a graduate student from the Politécnico de México
to M. I. T. (Sept. 65) and became associated with the AI Group, I
found a great interest there in graphical communication with computers.
Moreover, it was felt that symbol manipulation techniques would be
relevant to this area. I was fortunate enough to have had some
contact with the LISP language in some of its implementations:
MB-LISP {McIntosh 1963}* and Hawkinson-Yates-LISP {Hawkinson 64}*
at the Centro Nacional de Calculo of the Politécnico; in fact, I
became interested in the area because I felt that it would be possible
to handle two-dimensional structures much in the same fashion as one
handles lists (that is, one-dimensional structures or strings of
symbols) in a pattern-driven language, such as CONVERT {1965}, recently
finished at that time.

The area also offered a good opportunity to understand and 
evaluate several techniques, computers, equipment, etc. Consequently 
I decided to work in it. 


* The parentheses { } always indicate a reference to the
bibliography at the end of this thesis, where the complete title,
date, etc., of the paper can be found.
SIMPLIFIED VIEW OF SCENE ANALYSIS


TO THE BUSY READER 


This section presents a general view of the problems 
in the thesis and their solutions; if you are short of time, 
(1) Read the abstract and this section. 


(2) Choose some scenes from section ‘Analysis of many scenes’, 
and observe how the computer perceives them. 


(3) Look through the table of contents, select additional topics. 


Scene analysis is the result of interaction between ...
stored in the programs. In all that follows, the optical data entering
through the Eye is reduced to a line drawing; this phase is called
pre-processing, and it will be only briefly sketched here.


After preprocessing, such a line drawing is analyzed in order
to discover and recognize given objects in it. The process is
called recognition.

This thesis is concerned with recognition.

     The stylized presentation that follows is only an example; in
     particular, scene analysis does not need to follow the sequence
     pre-processing → recognition. See 'Division of work in
     Computer Vision' in page 60.

We now give a simplified exposition of both processes. Recognition
will be discussed abundantly in the remainder of this thesis, since
it is the main topic; readers who wish for more information on
pre-processing or other approaches should consult the references, for
instance {my MS Thesis} and {A C Shaw FJCC 68}. See also page 60.
Each inhomogeneous square is divided in four, ignoring
again the homogeneous sub-squares.

The process is repeated a few times more.
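
The repeated subdivision can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pre-processing program of the thesis: the image is assumed to be a square list of lists of intensities whose side is a power of two, "homogeneous" is assumed to mean that the spread of intensities does not exceed a threshold, and the names `subdivide`, `depth` and `threshold` are hypothetical.

```python
def subdivide(image, x, y, size, depth, threshold=0):
    """Return the (x, y, size) squares still inhomogeneous after `depth` splits.

    Assumptions (not from the thesis): `image` is a list of rows of numbers,
    `size` is a power of two, and a square is homogeneous when the spread
    of its values is at most `threshold`.
    """
    values = [image[r][c]
              for r in range(y, y + size)
              for c in range(x, x + size)]
    if max(values) - min(values) <= threshold:
        return []                      # homogeneous square: ignored
    if depth == 0:
        return [(x, y, size)]          # inhomogeneous, but no more splitting
    half = size // 2
    squares = []
    for dy in (0, half):               # divide the square in four
        for dx in (0, half):
            squares += subdivide(image, x + dx, y + dy, half,
                                 depth - 1, threshold)
    return squares


# A 4x4 picture whose boundary runs between columns 0 and 1.
picture = [[0, 9, 9, 9] for _ in range(4)]
edge_squares = subdivide(picture, 0, 0, 4, depth=1)
```

Only the two left-hand quadrants survive in `edge_squares`; the squares that remain after the last round are the ones that a later step would reduce to lines and vertices.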
The squares are now reduced to lines and vertices.


The resulting analysis gives us the first chance to start
working abstractly now, instead of continuing in "picture-point
space." Preprocessing is finished.


Recognition

     This and the next page describe proposed, but still
     unfinished, parts of the system.


What follows is merely a brief summary of the processes in
recognition. A more systematic presentation and classification of
processes in recognition is found in 'Division of work in Computer
Vision', on page 60.

A program would check in the original scene, on both sides of
each line, for continuation across the line, of textures, local cracks,
etc. On these and other grounds, shadows would be picked up and
erased:
A line-proposer program studies the abstract or "symbolic" scene and,
using some heuristics and general principles, proposes places where
it is quite probable that a line is missing:


These places are searched by a line-verifying program, which is an
especially sensitive test that uses fine measurements from the
original scene, and often it will pick up a boundary that was missed
in the less-intelligent homogeneity phase. Here it can be practical
to apply a very strict and sensitive test, because the program
knows very accurately where the line should be, if it really exists
at all. For example, even if the two faces have almost equal
illumination, the Eye can pick up a thin, faint highlight from the edge
of the cube. It would have been hopelessly expensive to look for
such detailed phenomena over the whole picture at the start.
At this stage our program SEE (page 58) comes
into action. This program treats different kinds of local
configurations as providing different degrees of evidence
for 'linking' the faces. This evidence is obtained mainly
at vertices, and at boundaries between regions.

A vertex is in general a point of intersection of
two or more boundaries of regions. These regions might or
might not be faces of a single body. SEE examines the
configuration of lines meeting at the vertex to obtain
evidence relevant to whether the regions involved belong
to some object.


For instance, in the vertex configurations "ARROW" and
"FORK" (a complete classification of vertices can be found
below in table 'VERTICES'),


"FORK"                         "ARROW"
the "FORK" suggests linking face a to face b, b to c, c to a.
The "ARROW" links a with b. A "LEG" (which depends on nearly
parallel lines) would add a weak link, in addition to the ordinary
(or strong) link placed by its 'arrow'; a "T" looks for a matching
"T", and if found, two strong links are placed as shown. Also, a
"T" counts against (inhibiting, that is) linking a with c, or
b with c.


"LEG"                          Matching T's
(weak link shown dotted)       (two strong links)
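
As a rough illustration of how vertices propose links, the sketch below covers only the "FORK" and "ARROW" cases described above. It is not SEE's code: the function names, the way a vertex is encoded as a `(kind, regions)` pair, and the convention that an "ARROW" lists its two linked regions first are all assumptions made for the example. "LEG" and "T" vertices, and the inhibition of links, are left out.

```python
from collections import Counter


def links_from_vertex(kind, regions):
    """Return the strong links proposed by one vertex.

    kind    -- "FORK" or "ARROW" (other vertex types are ignored here)
    regions -- region names around the vertex; for an "ARROW" the first
               two entries are assumed to be the regions a and b it links
    """
    if kind == "FORK":
        a, b, c = regions
        return [(a, b), (b, c), (c, a)]
    if kind == "ARROW":
        a, b = regions[:2]
        return [(a, b)]
    return []


def tally_links(vertices):
    """Count proposed links per unordered pair of regions."""
    counts = Counter()
    for kind, regions in vertices:
        for a, b in links_from_vertex(kind, regions):
            counts[frozenset((a, b))] += 1
    return counts


# Hypothetical vertices: a FORK among faces 1, 2, 4 and an ARROW on 1, 2.
evidence = tally_links([("FORK", (1, 2, 4)), ("ARROW", (1, 2, 7))])
```

Here faces 1 and 2 collect two links, while the pairs (2, 4) and (4, 1) collect one each; region 7 is not linked by the "ARROW" at all.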


These links, for our example, are


and may be represented as


[weak links are dotted]


indicating two groups of linked faces, that is, two bodies:


(BODY 1. IS 1 2 4)
(BODY 2. IS 3 5 6)


If in addition we give at this point to
the computer the definition or concept
of a 'triangular prism', through an
abstract model of it {my MS Thesis}, we
can get


(1 2 4 IS A TRIANGULAR-PRISM)
(3 5 6 IS A CUBE)


Analysis of several examples 


A larger variety of kinds of evidence is used in more complicated 
scenes, making the program more intelligent in its answers: 


(1) The links themselves are inhibited by conditions or configurations 
at the neighbor vertices and faces; for instance, in the case 
of a "FORK", the (strong) links indicated below are inhibited: 


(2) The links to the background are ignored [complete descriptions 
of conditions for producing and cancelling links are to be 
found in section ‘SEE, a program that finds bodies in a scene']. 


(3) A hierarchical scheme is used that first finds subsets of faces
that are very tightly linked (e.g., by two or more links).
These "nuclei" then compete for more loosely linked faces
(faces linked through one weak link and one strong link,
or one face completely unlinked, except by one strong link).
By not considering a single link, weak or strong, as enough
evidence for assigning two faces as part of the same object, this
algorithm requires two "mistakes" (that is, two careless
placements of links between regions that should not be considered as
forming the same body) to make an identification error.


The bodies of the following scenes are found by SEE without 
difficulty. 


Note that of the strong links available to the "FORK" marked with 
an arrow, two were prohibited or inhibited and only one is produced 
by SEE. 
Dotted links are weak


In the following figure, the "FORK" of the big object is missing. 


Statement of Rules

We will re-state the rules under (3) of page 22.


Region (definition). Surface bounded by simply closed curves.

We will consider the outer background (:16 in fig. 'L10', page 59)*
to be also a region.

Nucleus (definition). A nucleus (of a body) is a set of regions.

Linked nuclei (definition). Two nuclei A and B are linked if
regions a and b are linked, where a ∈ A and b ∈ B.


First rule: If two nuclei are linked by two or more strong links, 
they are merged into a larger nucleus. 


For instance, regions :8 and :11 are put together, because there
exist two strong links between them, to form the nucleus :8-11.


Maximal nuclei: Starting from nuclei containing individual regions,
we let the nuclei grow and merge under the First rule, until no new
nuclei can be formed. When this is the case, the scene has been
partitioned into several "maximal" nuclei; between any two of these
there is at most one strong link.

For instance, regions :8 and :11 are put together by the First
rule; now we see that region :4 has two links with nucleus :8-11,
and therefore the new nucleus :8-11-4 is formed. This last is a
maximal nucleus.


* For the moment, ignore the colons (:) in front of numbers. The
name of a region is a number preceded by a colon, such as :16.
The First rule is applied again and again, until all nuclei are
maximal nuclei; then the following rule is applied:


Second Rule: If nuclei A and B are joined by a strong and a weak
link, they are merged into a new nucleus.


The Third rule is applied after the Second rule.

Third Rule: If nucleus A consists of a single region, has one link
with nucleus B and no links with any other nucleus, A and B are merged.


(10 11) does not join the bigger nucleus because (10 11) does not
consist of a single region. Below, 9 does not join (7 8) or (4 5)
because 9 has two links:


The Third rule tends to avoid proposing bodies consisting of a 
single region. 
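
The three rules can be put together in a short Python sketch. It is an illustration of the rules as stated above, not a transcription of SEE: the representation of links as `(region, region, kind)` triples, the function names, and the choice to apply each rule until it no longer fires before moving to the next are assumptions. The sample links are hypothetical, apart from the :8, :11, :4 grouping taken from the text.

```python
from collections import Counter


def find_bodies(regions, links):
    """Group regions into nuclei with the First, Second and Third rules.

    links -- list of (region1, region2, kind), kind "strong" or "weak".
    """
    nucleus = {r: frozenset([r]) for r in regions}  # region -> its nucleus

    def tally():
        """Count links of each kind between pairs of distinct nuclei."""
        t = Counter()
        for r1, r2, kind in links:
            a, b = nucleus[r1], nucleus[r2]
            if a != b:
                t[frozenset([a, b]), kind] += 1
        return t

    def first_rule(a, b, t):            # two or more strong links
        return t[frozenset([a, b]), "strong"] >= 2

    def second_rule(a, b, t):           # one strong and one weak link
        pair = frozenset([a, b])
        return t[pair, "strong"] >= 1 and t[pair, "weak"] >= 1

    def third_rule(a, b, t):            # single region with a single link
        def lonely(x):
            total = sum(n for (pair, _), n in t.items() if x in pair)
            return len(x) == 1 and total == 1
        return lonely(a) or lonely(b)

    def apply(rule):
        changed = True
        while changed:
            changed = False
            t = tally()
            for r1, r2, _ in links:
                a, b = nucleus[r1], nucleus[r2]
                if a != b and rule(a, b, t):
                    merged = a | b
                    for r in merged:
                        nucleus[r] = merged
                    changed = True
                    break               # recount links after every merge

    for rule in (first_rule, second_rule, third_rule):
        apply(rule)
    return sorted(sorted(n) for n in set(nucleus.values()))


sample_links = [
    (8, 11, "strong"), (8, 11, "strong"),   # First rule: nucleus 8-11
    (4, 8, "strong"), (4, 11, "strong"),    # First rule again: 8-11-4
    (5, 6, "strong"), (5, 6, "weak"),       # Second rule: nucleus 5-6
    (9, 5, "strong"),                       # Third rule: 9 joins 5-6
    (6, 4, "strong"),                       # a single "false" link
]
bodies = find_bodies([4, 5, 6, 8, 9, 11], sample_links)
```

With these links the sketch reports two bodies, (4 8 11) and (5 6 9); the lone strong link between :6 and :4 is not enough to merge them, which matches the remark that two mistakes are needed to cause an identification error.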


The next example shows how three "false" links failed to lead 
SEE into error: 


Here three links were erroneously placed but SEE did not get
confused by them.


In complicated scenes, coincidences cause two objects to line up.
As a result, vertices of different objects are merged, two objectively
different lines appear as one and so on. The next example illustrates
these phenomena and shows how SEE copes with the problem.

SEE transforms the above scene as follows:


As we see, the nuclei are going to be correctly formed, and SEE will
also analyze this scene correctly.


The bodies do not need to be rectangular, prismatic, or convex. They
only need to be rectilinear. As we will see later, even curved objects
may be identified, under certain restrictions (cf. Table 'ASSUMPTIONS').


Figure 'BRIDGE' 
All the bodies in "BRIDGE" are adequately found. A new heuristic is
used here:

three parallel lines comprising regions that are not background, and
having the background as a neighbor, and a 'T' in the center line,
originate a strong link, as shown above.


The following locally ambiguous scene is correctly parsed by
our program:


If we add another block to the right, the program makes a mistake and
fails to see one of the inner cubes:


Figure 'MOMO' also gets decomposed accurately:


Figure 'MOMO'
The local links allow correct identification of the following body:


If the lateral faces do not have parallel edges, a mistake occurs
(conservative behavior, page 212):


At left, the above mistake is not produced
because vertex A links :2 and :8, by
the new heuristic introduced in 'BRIDGE'.


Conclusion


The performance of this program shows that it is possible to 
separate a scene into the objects forming it, without needing to know 
the objects in detail; SEE does not need to know the 
‘definitions' or descriptions of a pyramid, or a pentagonal prism, 
in order to isolate these objects in a scene containing them, even in 
the case where they are partially occluded. 

The program will be fully analyzed in the following pages. 
Problems in analyzing a visual scene*


The problem of taking a two-dimensional image (or several such
images), and constructing from it a three-dimensional interpretation,
involves many operations that have never been studied, to say nothing
of being realized on a computer. We will list some of these here;
a more complete list is found in my M.S. Thesis {MAC TR 37}; some
have been side-stepped or ignored by the present recognition system;
the problems which we did solve are discussed in the text.


Among the facilities that must be available are: 


a) Spatial frame-of-reference: setting up a model of the relation 
between the eye(s) and the general framework of the physical task, 
i. e., where are the background, the "table" or working surface, 
and the mechanical hand(s)? 


b) Finding visual objects, and localizing them in space with respect
to the eye-table-background-hand model.


c) Recognizing or describing the objects seen, regardless of their
position, accounting for partly-hidden objects, recognizing objects
already "known" by descriptions in memory and representing the
three-dimensional form of new objects.


d) Building an internal "structural model" of what has been seen, 
for the purpose of task-goal analysis. 


Among the important factors are the effects of:


1. Both the camera's focus and its depth-of-focus.

2. Illumination of the objects. Light affects the appearance of
objects in obvious and subtle ways -- in scenes with multiple
objects and lights we get complicated shadows, which have to
be detected or rejected. The boundary between two faces may
disappear if they get equal illumination from a diffuse light source.

3. Perspective and distance effects. Even for geometric objects with
flat surfaces, the two-dimensional projection of their surface
features can take many forms, and the system has to be able to deal
with all of them. It works both ways, of course: once identified,
the appearance can give valuable information about the object's
orientation, size, and even (under some conditions) its absolute
spatial location {Roberts 1963}.

4. Accidental vs. essential visual features. Two objects of the same
shape and location can have very different visual presentations
because of their surface textures and markings. We need to
distinguish these two-dimensional "decorations" from real
three-dimensional spatial features.


* Adapted from Status Report II {Minsky 67}. See also Project MAC
Progress Report {1967, 1968}.


Other projects 


Here are the main robot groups at a panel discussion,


1968 Fall Joint Computer Conference
December 9-10-11, San Francisco Civic Center

Problems in the implementation of intelligent robots

Chairman:
DR. BERTRAM RAPHAEL
Stanford Research Institute
Menlo Park, California

This session, the second of three sessions on robotry, will
consist of a panel discussion among technical people
involved in the design and construction of mechanical
devices that are capable of significant independent
"intelligent" behavior, usually by means of computer control. The
projects represented on this panel have drawn upon
state-of-the-art capabilities in many technologies including
mechanical engineering, pattern recognition, heuristic
programming, neural networks and computer systems. Thus,
the discussion which will be conducted at a fairly technical
level should be of interest to engineers and scientists
concerned with the problems of interfacing a variety of
disciplines, as well as to those interested in learning about the
nature of current embryonic "robot" systems.

NOTE: Tickets priced at $5.00 each (including lunch) for
the all-day tour of "live robot" installations on Wednesday,
Dec. 11th, will be available at this session.

Panel Members
MR. L. CHAITIN
Artificial Intelligence Group
Stanford Research Institute
ROBOT STUDIES AT STANFORD RESEARCH INSTITUTE

MIT


RELATED RESEARCH


Previous work by the author 


CONVERT 

A programming language is described which is applicable to
problems conveniently described by transformation rules. By
this is meant that patterns may be prescribed, each being
associated with a skeleton, so that a series of such pairs may
be searched until a pattern is found which matches an
expression to be transformed. The conditions for a match are governed
by a code which also allows subexpressions to be identified and
eventually substituted into the corresponding skeleton. The
primitive patterns and primitive skeletons are described, as
well as the principles which allow their elaboration into more
complicated patterns and skeletons. The advantages of the
language are that it allows one to apply transformation rules
to lists and arrays as easily as strings, that both patterns and
skeletons may be defined recursively, and that as a consequence
programs may be stated quite concisely.


Abstract of Convert paper in Comm. A.C.M. 


Because it is easy to write and modify a program in Convert,
the language has been extensively used to quickly test 'good'
and "great" ideas, new algorithms, etc. It is embedded in
the LISP of the PDP-6 computer (A.I. Group), in the IBM-7094
(Project MAC, MIT), in the CDC-3600 (Uppsala University, Sweden),
and in the SDS-940 (Univ. of California, Berkeley). A paper in the
A. C. M. and {MAC M 305} describe the language; examples of
simple programs written in Convert are in {MAC M 346}; a book
article {Patterns and Skeletons in Convert} is oriented
toward the Lisp consumers. For our Spanish readers, two
Bachelor's Theses {Guzman 1965} {Segovia 1967} describe the
language and processors, and give examples.
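
The idea of pattern and skeleton pairs can be conveyed by a small Python sketch. It does not use Convert's notation or reproduce its primitives; the tuple encoding of expressions, the `?name` convention for pattern variables, and the functions `match`, `instantiate` and `transform` are all assumptions chosen for the example.

```python
def match(pattern, expr, bindings=None):
    """Return variable bindings if `pattern` matches `expr`, else None.

    Assumed conventions: expressions are nested tuples of atoms, and a
    string starting with "?" is a pattern variable.
    """
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings and bindings[pattern] != expr:
            return None                # same variable, different value
        bindings[pattern] = expr
        return bindings
    if isinstance(pattern, tuple):
        if not isinstance(expr, tuple) or len(pattern) != len(expr):
            return None
        for p, e in zip(pattern, expr):
            bindings = match(p, e, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == expr else None


def instantiate(skeleton, bindings):
    """Substitute the matched subexpressions into the skeleton."""
    if isinstance(skeleton, tuple):
        return tuple(instantiate(s, bindings) for s in skeleton)
    return bindings.get(skeleton, skeleton)


def transform(rules, expr):
    """Search the (pattern, skeleton) pairs in order; apply the first match."""
    for pattern, skeleton in rules:
        bindings = match(pattern, expr)
        if bindings is not None:
            return instantiate(skeleton, bindings)
    return expr                        # no rule applies: leave unchanged


rules = [
    (("plus", "?x", 0), "?x"),         # x + 0  ->  x
    (("times", "?x", 1), "?x"),        # x * 1  ->  x
]
simplified = transform(rules, ("plus", ("times", "a", 1), 0))
```

Only the outermost expression is rewritten here, so `simplified` is `("times", "a", 1)`; applying the rules recursively to subexpressions would be a natural extension.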


SCENE ANALYSIS 


(1) Polybrick {MAC M 308} {Hawaii 69} is a Convert program that
works on a scene or picture, expressed as a line drawing, and finds
parallelepipeds in it.

(2) We would like to be able to specify in some suitable notation
models of the classes of objects we are interested in (such as 'cube',
'triangular prism', 'chair'), and make a program look for all
instances of any given model in a given scene or figure. Two arguments
would have to be supplied to our program: the model of the object
we are interested in, and the scene that we want to analyze.
Programs to do this are described in {AFCRL-67-0133} and {MAC M 342}.
In these early programs, partially occluded objects get incorrectly
identified. These programs are also written in Convert, and work
by transforming or compiling the model, written in a picture
description language, into a Convert pattern, which searches the scene for
instances of the model.

(3) A Master's Thesis {MAC TR 37} discusses many ways to identify
objects of known forms. Different kinds of models and their
properties are analyzed.

(4) It is important to be able to find the bodies that form a scene,
without knowing their exact description or model. SEE is a program
that works on a scene presumably composed of three-dimensional
rectilinear objects, and analyzes the scene into a composition of
three-dimensional objects. Partially occluded objects are usually
properly handled. This program was discussed in {MAC M 357},
{Guzman FJCC 68} and {Pisa 68}, and this thesis discusses a later
version.


(5) The present thesis goes beyond these topics to discuss also 
handling of stereo information (two views, left and right, of the 
same scene), improvements to deal with noisy (imperfect) input, 
figure-background discrimination, and a few other subjects. 


Canaday


Rudd H. Canaday in 1962 analysed scenes composed of
two-dimensional overlapping objects, "straight-sided
pieces of cardboard." His program breaks the image
into its component parts (the pieces of cardboard),
describes each one, gives the depth of each part in the
image (or scene), and states which parts cover which.


Roberts


The problem of machine recognition of pictorial data has long been a
challenging goal, but has seldom been attempted with anything more
complex than alphabetic characters. Many people have felt that research on
character recognition would be a first step, leading the way to a more
general pattern recognition system. However, the multitudinous attempts at
character recognition, including my own, have not led very far. The reason,
I feel, is that the study of abstract, two-dimensional forms leads us away
from, not toward, the techniques necessary for the recognition of
three-dimensional objects. The perception of solid objects is a process which can
be based on the properties of three-dimensional transformations and the
laws of nature. By carefully utilizing these properties, a procedure has been
developed which not only identifies objects, but also determines their
orientation and position in space.

Three main processes have been developed and programed in this report.
The input process produces a line drawing from a photograph. Then the
three-dimensional construction program produces a three-dimensional
object list from the line drawing. When this is completed, the
three-dimensional display program can produce a two-dimensional projection of the
objects from any point of view. Of these processes, the input program is the
most restrictive, whereas the two-dimensional to three-dimensional and
three-dimensional to two-dimensional programs are capable of handling
almost any array of planar-surfaced objects. {from Roberts}


Roberts in 1963 described programs that (1) convert
a picture (a scene) into a line drawing and (2) produce
a three-dimensional description of the objects
shown in the drawing in terms of models and their
transformations. The main restriction on the lines is
that they should be a perspective projection of the
surface boundaries of a set of three-dimensional objects
with planar surfaces. He relies on perspective and
numerical computations, while SEE uses a heuristic and
symbolic (i.e., non-numerical) approach. Also, SEE
does not need models to isolate bodies. Roberts' work is
probably the most important and closest to ours.


Mechanical Manipulator Groups (see also page 32).


Actually, several research groups (at Massachusetts
Institute of Technology, at Stanford University,
at Stanford Research Institute) work actively towards
the realization of a mechanical manipulator, i.e.,
an intelligent automaton that could visually perceive and
successfully interact with its environment, under the
control of a computer. Naturally, the mechanization of
visual perception forms part of their research, and
important work begins to emerge from them in this area.


THE CONCEPT OF A BODY 


In this section definitions of a body or object will be proposed.
The criterion is that they agree in general with the common use of
the word 'body', while at the same time they should lend themselves
to implementation into a computer program.


Introduction 


Our ultimate interest is to examine a two-dimensional scene (a
picture, line drawing, or painting), presumably a representation
(projection, photograph) of a three-dimensional scene (a subset of
the "universe" or "real world") and to find in it objects or bodies
contained in the real scene. More specifically, the aim is to find
the two-dimensional representations (projections, photographs) of
the different three-dimensional bodies present in the scene.

The phrase "two-dimensional representation of a
three-dimensional body" will be shortened to "two-dimensional
body" or even to "body", when no confusion arises.

That is, we have to analyze a two-dimensional scene into collections
of two-dimensional entities (surfaces, regions, lines), each of which
makes "three-dimensional sense" as a two-dimensional projection
of a three-dimensional body.


The problem is inherently ambiguous 


A scene can be considered as a set of surfaces (faces or regions);
a body belonging to that scene is then an "appropriate" subset of this
collection. Therefore, the problem of finding bodies in a scene is
equivalent to the problem of partitioning the set into appropriate
subsets, each one of them representing or forming a body (scene 'CHURCH').
The problem is inherently ambiguous, since different collections
of three-dimensional bodies can produce the same 2-dim scene; therefore
a given scene can be partitioned in many ways into bodies.
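
To give a feeling for how many candidate partitions there are, the sketch below counts every way of splitting n distinguishable regions into non-empty groups (the Bell number), using the standard Bell triangle. This is only an upper bound offered as an illustration: it ignores all geometric constraints, and the function name is an assumption.

```python
def count_partitions(n):
    """Number of ways to partition a set of n regions into non-empty subsets.

    Computed with the Bell triangle: each row starts with the last entry
    of the previous row, and every further entry adds the entry above-left.
    """
    row = [1]
    for _ in range(n - 1):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[-1]


partitions_of_eight = count_partitions(8)
```

For a scene of eight regions this gives 4140 possible partitions, of which a program such as SEE is expected to pick the one that people would consider natural.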
It is desired to make a "natural" partition or decomposition
of the scene, natural in the sense that will agree with
human opinion.*

     Figure 'CHURCH'
     Set of eight elements. Adequate subsets (bodies) are [2 4],
     [1 3 5 6 7 8]. In a more complicated example, people may
     differ in their parsing of scenes.

To define a three-dimensional body is no problem
[a philosopher may disagree, perhaps in singular cases]:


Three-dimensional body (definition):

A connected volume limited by a
continuous, two-sided surface composed of
portions of planes.

Restriction: The above definition covers only polyhedral bodies,
that is, those having flat faces.

Restriction: No holes.

No-restriction: Bodies do not need to be convex.

Roughly speaking, a three-dimensional body is something that does not
fall apart into pieces when lifted [this may be used as an operational
definition of a body, given a mechanical manipulator to make the
necessary tests].


Given a three-dimensional body, we generate a two-dimensional body 
by taking a picture of it, as follows. 


Two-dimensional body (definition). Figure formed by the projection of
a three-dimensional body. Generally the projection
is isometric or perspective.

Thus, this is a view in two dimensions, or a photo, from a
particular point of view.

Unfortunately, a two-dimensional body could arise in this way from
any of several different 3-dim bodies or, what is worse, two 3-dim bodies
together can give rise to a single 2-dim body. For instance, in fig. 'BENT',


*Without such a requirement, the problem has a trivial solution
(see Metatheorem in page 39).

Figure 'BENT'
Two blocks or a bent brick.


this two-dimensional body could be generated by a bent brick or by
two blocks adjacent to each other. We are dealing with one
three-dimensional body in the first case, with two in the second. But the
2-dim entity (namely, the drawing of figure 'BENT') is the same, and [...]


Sibelius' Monument [...] is given in Fig. 'SIBELIUS',
which could be the representation of [...] bodies, or the
picture of a [...]
Such colorful contradictions point towards the need to lay down
a more careful definition of our task. For instance, no one would think
that figure 'CUBE'


     Fig. 'CUBE'
     No one would think...


contains three bodies. Nevertheless (see fig. 'PARALLELEPIPED' in
next page), that could be the case.

These two extremes are to be avoided by an appropriate definition
of a body and the corresponding computer program.


Legal scene (definition). That two-dimensional scene in which each
line is a boundary of some region.


Legal scene.          Illegal.          Illegal.

See also comments to scene R3, and 'Illegal Scenes' (page 217), in
section 'On noisy input'.

Metatheorem: "Any legal scene can always be the projection of one or
more three-dimensional objects."

To prove it, it suffices to note that each legal scene is composed
of regions, and each of them could be interpreted as the
base of a pyramid, all the faces meeting at the apex being occluded by
the base.

Therefore, each legal scene can be obtained by projecting or
photographing an adequate arrangement of such pyramids.

     We can always construct a
     legal scene by photographing
     (or projecting) suitable
     3-dim polyhedra.
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Figure 'PARALLELEPIPED'
An improbable decomposition of a scene.
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By use of the metatheorem, we can always find a
decomposition of a visual scene into three-dimensional bodies; we
call this answer "trivial". Humans do not split scenes this way.
Our program should not, either.

But the metatheorem points out that "impossible scenes" are never
found among the legal scenes (see section 'On Optical Illusions');
these always have at least one interpretation (that of 'trivial partition').


We are trying to give criteria for proposing bodies that will 
suit our ends, which are to define a "reasonable" or "standard" body. 
This will permit us to judge the performance of a program designed 
to find objects in a scene. 


Several criteria are possible: 


1. Roberts {1963} suggests: given several models of three-dimensional
   bodies, use some numerical techniques, such as least squares
   fitting, to find which model fits best through a suitable
   transformation, and accept this match if the error is tolerably
   small. Complicated compositions of elementary bodies
   are considered.

2. Ledley {1962} would propose: in terms of suitable primitive components
   (arcs, legs, etc.), make a syntactical analysis of the scene,
   with the help of a grammar, in such a way that the models of
   the object you want to identify are formed recursively from
   these primitive components and (perhaps) other bodies.
   Narasimhan {1962} and Kirsch {1964} would agree on this
   linguistic approach. A. C. Shaw {Ph. D. Thesis} assents.

3. Guzman {1967} suggests: prepare models which specify a fixed
   topology but where other relations (length of sides, parallelism
   of two lines, equality of angles) are specified through
   the use of open variables (UAR variables, in CONVERT).

Evans {1968} would agree with that. 


These approaches require the existence of a model which describes the 
object to be identified; the model specifies a particular 3-dim object 
(or a class of them). These approaches are answering more than what 
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was asked; they tell not only "yes, it is a body", but also
"it is a pyramid". The current question is more general.

        It is desired to know if something is a body, any body,
        even one which has not been seen before.


If it were possible to implement a program to answer that question, 
then that would be a working definition of a body. SEE is a program 
which comes close to this goal, so that it could be pragmatically stated: 


2-dim body "a la SEE" (definition). A body is each set of regions
        recognized by the program SEE as such.
This definition allows the following
Criticism: A perfect way to hunt lions is to
           capture any entity E, and to call
           that a lion, by definition.
That is, although this definition is precise, SEE may make
decisions "contrary to common sense"; also, for purposes of judging
the behavior of the program, this definition is useless, since SEE
will be perfect 100 per cent of the time, irrespective of its answers.


We are, finally, tempted to conclude that "common sense", or
better, "human common sense", plays a role in the definition of a body,
since what we are trying to characterize is a usual body, normal body,
common body, etc. But even people may differ in their parsings of
scenes. We could, of course, give a scene (such as 'MOMO' in page 77)
to 100 subjects, ask them to identify the different bodies in it, and
come up with some sort of 'average' or 'general consensus':

2-dim body (statistical and human-behavioral definition). Each one of
        the subsets into which a scene is partitioned by many subjects.

It is understood that, in this spirit, the human subjects should be
motivated to satisfy a

Simplicity criterion: Of the several "reasonable" interpretations
        (decompositions) of a scene, the one which
        contains the smallest number of bodies is
        preferable.
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That is, an explanation or decomposition is simpler (and preferable)
if it can be done with fewer parts.

Simplicity is not to be achieved at any cost: the parsing
of the scene has to produce 'plausible' bodies. Otherwise "simplicity"
could always be achieved by reporting each scene as a single,
gigantic body, obtained perhaps from more familiar ones through liberal
use of adhesives (cf. also Sibelius' Monument).
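
The simplicity criterion, tempered by plausibility, can be sketched in a few lines of Python (SEE itself is written in Lisp and does not work this way). The sketch assumes that a list of candidate decompositions is already available and that a hypothetical predicate `plausible` stands in for the human judgment discussed above; both are illustrations, not parts of the thesis's programs.

```python
from typing import Callable, FrozenSet, Iterable, List

Body = FrozenSet[int]       # a body is a set of region numbers
Parsing = FrozenSet[Body]   # a parsing is a set of disjoint bodies


def simplest_parsings(candidates: Iterable[Parsing],
                      plausible: Callable[[Body], bool]) -> List[Parsing]:
    """Return the parsings made only of plausible bodies that use the
    fewest bodies. Ties are all returned."""
    admissible = [p for p in candidates if all(plausible(b) for b in p)]
    if not admissible:
        return []
    fewest = min(len(p) for p in admissible)
    return [p for p in admissible if len(p) == fewest]


def parsing(*bodies):
    """Helper: build a Parsing from tuples of region numbers."""
    return frozenset(frozenset(b) for b in bodies)


# Hypothetical three-region scene.
candidates = [
    parsing((1,), (2,), (3,)),   # one body per region (the trivial answer)
    parsing((1, 2), (3,)),       # regions 1 and 2 form one body
    parsing((1, 2, 3)),          # everything glued into a single body
]


def plausible(body):
    # Assumed rule for this toy scene only: region 3 never joins other regions.
    return 3 not in body or len(body) == 1


best = simplest_parsings(candidates, plausible)
```

Without the plausibility test, the single gigantic body would always win, which is exactly the degenerate case the text warns against.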


The chief choices are surely: 
== To choose a parsing, or

== To list many (perhaps rank-ordered) in case of ambiguity.

If we select the first alternative, further choices are
== to have a natural parsing (human).
== to have a canonical parsing, in the sense of minimizing
   some variable (the minimization of the number of bodies
   leads us to Sibelius' Monument, its maximization to the
   Trivial Solution of the metatheorem [page 4! ]).

Other kinds of 2-dim data. We have been discussing identification of
3-dim bodies (through their 2-dim projections) in a 2-dim scene,
purely on the basis of geometric regions. Many other kinds of
information could be used, such as texture, color, and shadows.


Nevertheless, it is interesting 
to see how far the identification 
of bodies can go if only geometric 
properties are used. 
Finding bodies in a 2-dim scene is a task not very precisely
defined, because of the ambiguities inherent in any projection process.
On these grounds, the concept of 'body' is best described through
familiarity, human opinion and consensus. We are forced to this because
any scene could be partitioned in several ways (cf. fig. 'PARALLELEPIPED'),
only some of which may be considered plausible or 'sensible' (natural,
common, standard) partitions in regard to the bodies forming it.
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TOTAL ANALYSIS OF VERTICES 


Synopsis. Here a scene is considered as formed by several regions;
bodies are adequate collections of regions. The problem of identifying
bodies is restated as the problem of finding whether two regions
belong or do not belong to the same body. This question is answered
by examining the vertices of the scene.

It is shown that a single vertex never conveys conclusive
evidence, so that at least a pair of vertices is required to isolate a
body; familiar and unfamiliar configurations of objects help to
understand how the vertices are to be used in this task.


Vertices are the important feature 
All faces of polyhedra are bounded 
by edges. 


All edges terminate in vertices. 


== This thesis deals with the analysis of visual scenes composed
   mainly of three-dimensional planar objects.

== These are limited by flat surfaces.

== All these bodies share as a common feature the edge: the place where
   two planes [faces] meet (but see page $7 ).

== Wherever several edges or faces meet, a vertex appears. This is
   also a common feature for all the bodies.


A body is formed by vertices with edges connecting some of these.
When a 3-dim body is projected into a 2-dim body, its 3-dim vertices
(which we will call genuine 3-dim vertices) are transformed into
genuine 2-dim vertices, known as images of the 3-dim vertices, as
figure 'GENUINE' (in next page) indicates.

That is, a genuine 2-dim vertex has come from a genuine 3-dim
vertex. Some 2-dim "false" vertices appear too; they do not come
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Figure 'GENUINE'
Two 3-dim bodies, one of them showing its genuine 3-dim vertices.
A 2-dim scene containing two 2-dim bodies, one of them showing its
genuine 2-dim vertices. Three false vertices also appear.

A genuine vertex (such as G1') is one whose counterimage
(G1 in this case) belongs to some body; a false vertex,
such as F1', is a virtual intersection, and generally
has no counterimage in the 3-dim world. See fig. 'NODES'.
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from genuine 3-dim vertices, but rather from the partial occlusion
of parts of opaque bodies [transparent objects give rise to a different
kind of false vertices; Guzman {MS Thesis} deals with them by using
transparent models, and a mode of operation of TD, the recognizer,
that re-interprets or ignores certain types of vertices {AFCRL-67-0133}].
False vertices do not belong to any object.

Genuine and false vertices. The classification of vertices into
categories "genuine" and "false" will allow isolation of objects in a
picture; in fig. 'GENUINE', elimination of vertices F1', F2' and F3'
divides the genuine nodes of the network (see fig. 'NODES') into two
non-connected components, correctly separating the two bodies.


        Figure 'NODES'
False vertices arise from the intersection of two
projected edges, one of which is typically occluded
in part by a face bordered by the other. Elimination
of the false nodes F1', F2' and F3' disconnects
the network in two separate components, which are
the bodies sought for.


This suggests the following 
2-dim body (first approx. to definition). Set of regions possessing
only genuine vertices, and separated from other bodies 
by false vertices. 
In this way, the problem of identifying bodies is equivalent to the 
problem of identifying genuine vertices, segregating the false ones. 
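
Under this first approximation, once every vertex has been labeled genuine or false, the separation reduces to finding connected components of the network of genuine vertices. The Python sketch below illustrates only that step; it assumes the genuine/false classification is handed in ready-made (which, as discussed next, is the hard part), and it groups vertices rather than regions. The vertex names are hypothetical.

```python
from collections import defaultdict


def bodies_from_vertices(edges, false_vertices):
    """Group the genuine vertices into connected components.

    edges: iterable of (vertex, vertex) pairs, the lines of the scene.
    false_vertices: set of vertices already classified as false.
    Returns a list of sets of genuine vertices, one set per component.
    """
    graph = defaultdict(set)
    for a, b in edges:
        for v in (a, b):
            if v not in false_vertices:
                graph[v]                 # register every genuine vertex as a node
        if a not in false_vertices and b not in false_vertices:
            graph[a].add(b)
            graph[b].add(a)

    components, seen = [], set()
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                      # depth-first walk from `start`
            v = stack.pop()
            if v not in component:
                component.add(v)
                stack.extend(graph[v] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical network: two triangles joined only through false vertices.
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G1"),
         ("H1", "H2"), ("H2", "H3"), ("H3", "H1"),
         ("G2", "F1"), ("F1", "H1"), ("G3", "F2"), ("F2", "H3")]
components = bodies_from_vertices(edges, {"F1", "F2"})
```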
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Problems to be solved. The computation of this equivalence is challenged
by several problems:
== The distribution and position of bodies may be such that false
   vertices look like genuine vertices (fig. 'CAUTION').

        Fig. 'CAUTION'
        That vertex looks genuine, but is false.

   Global information (analysis of more than one vertex) is needed
   in general to distinguish them. In other words, although false
   vertices are those which separate two bodies, and 2-dim genuine
   vertices originate from 3-dim genuine vertices, to segregate
   them requires more than the simple analysis of their shape.

== Some genuine vertices look like false vertices.

== Genuine vertices of a body may not be present in the scene, or
   may be supplanted by false vertices.

== Continuation is not clear; it may be doubtful whether the object in
   the foreground covers one or two bodies (fig. 'CONTINUATION');
   the simplicity criterion prefers the single body interpretation.
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Fig. 'CONTINUATION'
Continuation is not clear.


In brief, difficulties are of two kinds: 


== Genuine and false vertices cannot be distinguished
   locally (see Theorem below).

== Even when they are completely classified, the problem of
   fig. 'CONTINUATION' remains.


The solution of these problems will have to make use of more global 
information. 

Classification of vertices. Table 'VERTICES' in next page classifies
vertices according to their form, number of lines and angles
among the lines. It contains the most common types; vertices having
more edges could have been included.

Let us consider one of these types, ARROW. Three regions, called
1, 2, and 3, form it. The standard, most common
ARROW configuration is a body with faces 1 and 2
seen against some other object 3. We indicate
this by [ (1 2) (3) ]. However, all other configurations are possible:

        [ (1) (2) (3) ]
        [ (1 3) (2) ]
        [ (1) (2 3) ]
        [ (1 2 3) ]
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'L'.-      Vertex where two lines meet.
'FORK'.-   Three lines forming angles smaller than 180 degrees.
'ARROW'.-  Three lines meeting at a point, with one of the angles
           bigger than 180 degrees.
'T'.-      Three concurrent lines, two of them collinear.
'K'.-      Two of the lines are collinear, and the other two fall on
           the same side of such lines.
'X'.-      Two of the lines are collinear, and the other two fall on
           opposite sides of such lines.
'PEAK'.-   Formed by four or more lines, when there is an angle
           bigger than 180 degrees.
'MULTI'.-  Vertices formed by four or more lines, and not falling
           in any of the preceding types.

TABLE 'VERTICES'
Classification of rectilinear vertices.
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Thus, for an ARROW, all the groupings of its faces are possible; any 
procedure that, by looking at an ARROW, tries to decide how its faces
are grouped into bodies, will make mistakes in some scenes.

The generalization of the above analysis to all other types of 
vertices proves the following 


Theorem. There does not exist a set of local decision procedures
    {μ_v}, each one looking or getting information from one vertex
    and establishing b-equivalences among some of its faces
    (two faces a and b are b-equivalent, indicated a=b, if
    the μ_v decides that they belong to the same body; this is
    an equivalence relation), using information only from that
    vertex (it does not look at the other vertices or at the values
    of the μ's at the other vertices), which will partition all
    scenes correctly.


That is, the following machine will not work for all scenes: 


        Figure 'MACHINE'
The decision procedures μ_v, represented as 'eyes' here,
decide by processing information at exactly one vertex;
the box in the right accepts all these decisions and passes them as
results. No matter what set of μ_v we choose, there exists a scene
that induces an incorrect partition by our machine.
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A stronger assertion is that, in view of inherent ambiguity, 
there is not even any global procedure!


All the different groupings of regions of a vertex into bodies 
are possible; this is illustrated by the following complete set of 
scenes, each one of them showing a different partitioning of a type 
of vertex. These examples are useful also in giving an idea of
unusual, as well as familiar scenes; we will have later occasion to 
use them, when searching for heuristics to form bodies. 


Generation of partitions

There are only two partitions of a set of two elements.

    compo ((1 2))
     1  ((1) (2))
     2  ((1 2))

Partitions of a set of three elements

    compo ((1 2 3))
     1  ((1) (2) (3))
     2  ((1 2) (3))
     3  ((1 3) (2))
     4  ((1) (2 3))
     5  ((1 2 3))

Partitions of a set of four elements

    compo ((1 2 3 4))
     1  ((1) (2) (3) (4))
     2  ((1 2) (3) (4))
     3  ((1 3) (2) (4))
     4  ((1 4) (2) (3))
     5  ((1) (2 3) (4))
     6  ((1 2 3) (4))
     7  ((1 4) (2 3))
     8  ((1) (2 4) (3))
     9  ((1 2 4) (3))
    10  ((1 3) (2 4))
    11  ((1) (2) (3 4))
    12  ((1 2) (3 4))
    13  ((1 3 4) (2))
    14  ((1) (2 3 4))
    15  ((1 2 3 4))

Figures in the next few pages are numbered according to the numbers
in the leftmost column in these tables.
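
The counts in these tables (2, 5 and 15 partitions) can be checked mechanically. The following Python sketch is not the COMPO function of the thesis, whose listing is not reproduced in this section; it is a standard recursive enumeration of set partitions, and the order in which it yields them differs from the numbering used in the tables.

```python
def partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in partitions(rest):
        # `first` in a block of its own ...
        yield [[first]] + smaller
        # ... or added to each existing block in turn
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]


# Number of partitions of sets with 2, 3 and 4 elements.
counts = [len(list(partitions(list(range(1, n + 1))))) for n in (2, 3, 4)]
# counts == [2, 5, 15]
```

Each partition of the regions around a vertex is one candidate grouping of those regions into bodies, so an ARROW or FORK (three regions) admits 5 groupings and a four-region vertex admits 15.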
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[Figures: scenes illustrating the partitions of the regions of each type of vertex (CORNER, ARROW, ..., MULTI, PEAK).]
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Digression 1. An alternate approach

Suggestion


As an alternate approach, one could try to use the faces as a 
basis for identification. For instance, use two scenes (left image, 
right image) or pictures, localize a sharp feature in one of them 
(vertex, crack in the face, peculiar texture, etc.) and by correlation 
or some other method, find it also in the other picture. Having 
found a few points in both images in this manner, determine the plane 
of the face, in 3-dim space. When several faces are thus identified, 
we can compute, if desired, their intersection and obtain the edges 
(lines). It will generally suffice to ignore the edges and rely on 
the faces. Since it is reasonable to expect considerable difficulty 
in finding lines and in differentiating lines caused by edges from 
those caused by shadows, an approach which avoids the lines altogether 
looks promising. But in this case, in addition to requiring two 
images, several correlations are needed (if we choose this method), 


a generally time-consuming and error-prone task. 
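
The geometric core of this suggestion (a plane from a few matched points, and an edge as the intersection of two such planes) can be sketched as follows. The Python code assumes that the 3-dim coordinates of the matched points have already been recovered from the two images, a step not shown here; the function names and the sample points are illustrative only.

```python
def plane_through(p, q, r):
    """Return (a, b, c, d) with a*x + b*y + c*z = d for the plane through
    three non-collinear 3-dim points p, q and r."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    # The normal of the plane is the cross product u x v.
    a = u[1] * v[2] - u[2] * v[1]
    b = u[2] * v[0] - u[0] * v[2]
    c = u[0] * v[1] - u[1] * v[0]
    if a == 0 and b == 0 and c == 0:
        raise ValueError("the three points are collinear")
    d = a * p[0] + b * p[1] + c * p[2]
    return a, b, c, d


def edge_direction(plane1, plane2):
    """Direction of the line (edge) where two non-parallel planes meet:
    the cross product of their normals."""
    a1, b1, c1, _ = plane1
    a2, b2, c2, _ = plane2
    return (b1 * c2 - c1 * b2, c1 * a2 - a1 * c2, a1 * b2 - b1 * a2)


top = plane_through((0, 0, 1), (1, 0, 1), (0, 1, 1))    # the plane z = 1
side = plane_through((1, 0, 0), (1, 1, 0), (1, 0, 1))   # the plane x = 1
edge = edge_direction(top, side)                         # runs along the y axis
```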
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SEE, A PROGRAM THAT FINDS BODIES IN A SCENE


Synopsis 


How SEE works.

Algorithms and heuristics are presented, implemented in a
program, that analyze a scene into a composition of three-dimensional
objects. Only the two-dimensional representation of the
three-dimensional scene is available as input, and it is described by a
collection of surfaces, lines and vertices.

SEE looks for three-dimensional objects in two-dimensional scenes.
The program does not require a preconceived idea of the form of the
objects which could appear in the scenes. It is only assumed that
they will be solid objects formed by plane surfaces. Thus, SEE cannot
find "pentagonal prisms" or "houses" in a scene, since it does
not know what a "pentagonal prism" is; but it will usually isolate
the pentagonal prisms (or any other regular or irregular solid) in a
scene, even if some of them are partially occluded, without having
a description of such objects. It does this by paying attention
to configurations of surfaces and lines which would make plausible
three-dimensional solids, and in this way 'bodies' are identified.


The analysis that SEE makes of the different scenes generally
agrees with human opinion, although in some ambiguous cases it
tends to be conservative. The most interesting thing about the
program is how well it deals with occlusions. Many examples in
the next section 'Analysis of many scenes' illustrate the features
and peculiarities of the program, and also illustrate the effects
of inaccuracies introduced in the data.
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[Figure: a scene analyzed by SEE.]


INTRODUCTION 


Here is a program that locates objects in an optical image of a
scene most likely composed of three-dimensional solids, perhaps
occluding one another, so that some of them may not be totally
visible. We use a line drawing as our representation of the scene.

The analysis of scene L10 (see figure 'L10' in next page) by 


our program, named SEE, produces 


(BODY 1. IS 5 81 34 812)
(BODY 2. IS 26 235 87 812 834)
(BODY 3. IS 28 29 310 33)
(BODY 4. IS %2 213)


Division of work in computer vision 


In trying to construct a program for seeing, several approaches 
are possible; most of them require some of the following set of 


modular programs or subroutines. 


Pre-processing. Converts the image from a 2-dim array of intensities
to a symbolic representation or 'internal format' (page 66), in
terms of vertices and lines connecting them.

Homogeneity predicates. They decide if areas of the picture are
inhomogeneous, and hence require further analysis (page IG).

Color predicates. Boundaries of different color suggest lines.

Line finder. Locates lines of points having a certain property
(such as being inhomogeneous, or having a large light intensity
gradient).

Vertex finder. Concurrent lines are merged, or a vertex is created
at their meeting point.

Consolidator. Eliminates the false lines and finds more lines,
increasing in this way as much as possible the reliability of the
system.
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Illumination program. Discovers where the main light sources are.

Shadows program. Detects shadows so as to eliminate them.

Missing lines program. General shape considerations suggest places
where faint lines can remain undetected.

Body recognition. Partitions the scene into appropriate subsets, each
one being a body or object. Thus, SEE is a body-recognition program.

Object identification. These objects are compared against abstract
descriptions (models) of cubes, pyramids, etc., so that a classification
is done, and a name is attached to each one. In the process, certain
parameters may acquire values: the height of the pyramid is observed.

Positioning. Having analyzed the scene, the relevant objects are
positioned in three-dimensional space, and additional relations among
them are discovered (support, obstruction, etc.). Enough information
is obtained to allow the mechanical arm to manipulate the objects and
achieve its goals.

Stereo. More than one view is analyzed (page 233) and from them,
3-dim spatial positions are found.

Focussing. The computer, by adjusting the focus of its lens,
acquires knowledge of how far the objects are.

Feedback among these parts is more necessary as the complexity of the
scene and of the desired goals increases.


Recognizer. The task of body recognition and body identification was 
formerly accomplished by a single program (for instance, DI or TD {my 
MS Thesis}) that compares the symbolic description of the scene against 
the symbolic or abstract description of the model of the desired object, 
in a kind of two-dimensional matching, to isolate instances of that 


object in the scene. 


Technical descriptions of SEE 


1. Annotated listings. Above all, the primary source of information
is the listing of the programs, which appears complete in this thesis.
They are written in Lisp. If, despite my efforts, some of my explanations
are not clear, consult it: it is annotated. The programs themselves,
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examples, test data, results, instructions, etc., are in the
DEC-magnetic tape "GUZMAN F" at Project MAC (AI group). Instructions
are given in page 78.

2. This section of the thesis contains a description and discussion
of the different algorithms and procedures used.

3. Published papers that cover part of the material at somewhat
less depth, and therefore are more readable, are also available
{FJCC 68} {Pisa 68}. Except for some examples not included here,
they contain no information not covered here.

4. An internal report {MAC M 357} described an earlier version of SEE.


FIGURE 'R3'
A scene.
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INPUT FORMAT 


Eventually, several preprocessors will be able to receive data
through an input camera and reduce it to the "internal format" of a
scene, in the form required by SEE. For testing purposes, the scenes
are entered by hand in a simplified format, called 'input format',
to be described now. All the scenes analyzed by SEE have been written
in input format.


Example. R3. The input format of scene R3 is

(DEFPROP R3 (%:7) BACKGROUND)
(NOT (SETQ R3 (QUOTE (

%A   4.3    4.5    (%:7 %G %:4 %C %:1 %B)
%B   4.0    5.7    (%:7 %A %:1 %D)
%C   4.8    6.5    (%:4 %F %:2 %D %:1 %A)
%D   4.5    9.15   (%:7 %B %:1 %C %:2 %E)
%E   5.65   9.25   (%:7 %D %:2 %F)
%F   5.85   8.6    (%:7 %E %:2 %C %:4 %G)
%G   6.6    5.2    (%:7 %F %:4 %A)
%H   6.9    15.4   (%:7 %L %:3 %K %:5 %I)
%I   8.5    16.0   (%:7 %H %:5 %J)
%J   11.8   12.6   (%:7 %I %:5 %K %:6 %N)
%K   10.0   11.9   (%:6 %J %:5 %H %:3 %M)
%L   7.10   13.2   (%:7 %M %:3 %H)
%M   10.0   9.7    (%:7 %N %:6 %K %:3 %L)
%N   11.65  10.3   (%:7 %J %:6 %M)
))))

        R3 IN INPUT FORMAT



The first line declares :7 to be the background.* We have to
tell SEE which regions belong to the background. If this information
is missing, a program is called that will compute the regions that
belong to the background (see section 'Background discrimination by
computer') prior to other calculations.

After that, the lines associate with each vertex its 2-dim
coordinates and a list (which will later be called 'KIND'), in
counterclockwise order, of regions and vertices radiating from that vertex.

The function PREPARA (see listing) converts the scene as just given
to the "internal format" which SEE expects. It does this by putting
many properties in the property lists of the atoms representing vertices
and regions (property lists in Lisp are explained in next page).


*For the moment, ignore the % signs. They are used to distinguish 
right from left scenes. 
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Property lists in Lisp.* Each atomic expression in Lisp has a
property list, which is a place where facts can be stored.

If it is desired to represent the fact that John is a 69-year-old
male, has a wife called Jacqueline, and a height of value 1.77 m,
we could proceed in Lisp as follows:

(1) We will agree that the atom 'JOHN' will represent our man.

(2) In the property list of 'JOHN' we will store several properties
    or indicators and their values, using the function PUTPROP, that
    stores information in the property list; thus
        (Putprop (quote John) (quote Jacqueline) (quote Wife))
    will add, under the indicator or property 'Wife', the value
    'Jacqueline':
        JOHN
            WIFE -- JACQUELINE

(3) Hence, the representation of our facts in Lisp is
        JOHN
            SEX    -- MALE
            AGE    -- 69.0
            WIFE   -- JACQUELINE
            HEIGHT -- (1.77 m)

(4) In fact, the property list of 'JOHN', which is the CDR of 'JOHN'
    in Lisp 1.6 {MAC M 313}, is
        (SEX MALE AGE 69.0 WIFE JACQUELINE HEIGHT (1.77 m) ...)

(5) If later we want to know the age of John, we will ask
        (Get (quote John) (quote Age))
    and the value will be 69.0


* This paragraph, which can be skipped if it is known what a
property list is, will make the next section clearer.
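
For readers more at home with dictionaries than with Lisp, the John example can be imitated in Python. This is only an analogy: the functions `putprop` and `get` below are stand-ins written for this illustration (keeping the argument order atom, value, indicator used above), and the table `property_lists` is an assumption of the sketch, not a description of how Lisp 1.6 stores property lists.

```python
property_lists = {}   # atom name -> {indicator: value}


def putprop(atom, value, indicator):
    """Store `value` under `indicator` in the property list of `atom`."""
    property_lists.setdefault(atom, {})[indicator] = value


def get(atom, indicator):
    """Return the value stored under `indicator`, or None if there is none
    (Lisp's GET answers NIL in that case)."""
    return property_lists.get(atom, {}).get(indicator)


putprop("JOHN", "MALE", "SEX")
putprop("JOHN", 69.0, "AGE")
putprop("JOHN", "JACQUELINE", "WIFE")
putprop("JOHN", (1.77, "m"), "HEIGHT")

age = get("JOHN", "AGE")   # 69.0
```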
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TABLE 'R3 IN INTERNAL FORMAT'
Property lists of the scene R3 (REGIONS, VERTICES, BACKGROUND), of its
regions (NEIGHBORS, KVERTICES, FOOP) and of its vertices (XCOR, YCOR,
NVERTICES, NREGIONS, KIND, TYPE).



INTERNAL FORMAT 


The program assumes the scene in a special symbolic format,
which basically is an arrangement of relations between vertices and
regions, which are represented by atoms having adequate properties
in their property lists.

A scene has a name which identifies it; this name is an atom
whose property list contains the properties 'REGIONS', 'VERTICES',
and 'BACKGROUND'. For example, the scene R3 (see figure R3) has the
name 'R3'. In the property list of R3 we find (see also table 'R3 IN
INTERNAL FORMAT'):


REGIONS       (%:6 %:5 %:3 %:2 %:1 %:4 %:7)
              Unordered list of regions
              composing the scene R3. Order is immaterial.

VERTICES      (%N %M %L %K %J %I %H %G %F %E %D %C %B %A)
              Unordered list of vertices
              composing the scene R3.

BACKGROUND    (%:7)
              Unordered list of regions
              composing the background of
              scene R3.

Region. A region corresponds to a surface limited by simple closed curves.
Regions are represented by atoms that start with a colon (:). For instance,
in R3, the surface delimited by the vertices K J N M is a region,
called :6, but D E F G A C is not.

Each region has as name an atom which possesses additional
properties describing different attributes of the region in question. These
are 'NEIGHBORS', 'KVERTICES', and 'FOOP'. For example, the region in
scene R3 formed by the lines DE, EF, FC, CD has ':2' as its name.
In the property list of :2 we find:

NEIGHBORS     (%:4 %:7 %:7 %:1)
              Counterclockwise ordered list of
              all regions which are neighbors to
              :2. For each region, this list is
              unique up to cyclic permutation.
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KVERTICES     (%F %E %D %C)
              Counterclockwise ordered list of
              all vertices which belong to
              region :2. This list is unique
              up to cyclic permutation.

FOOP          ((%:4 %F %:7 %E %:7 %D %:1 %C))
              Each sublist is a counterclockwise
              ordered list of alternating
              neighbors and kvertices of :2.
              Each sublist is unique up to cyclic
              permutation, and indicates a
              simple boundary.

Each sublist of the FOOP property of a region is formed by a
man who walks on its boundary always having this region to his left,
and takes note of the regions to his right and of the vertices which
he finds in his way.


As another example, in the property list of :7 we find:

NEIGHBORS     (%:6 %:6 %:3 %:3 %:5 %:5 %:2 %:2 %:4 %:4 %:1 %:1)

KVERTICES     (%N %M %L %H %I %J %E %F %G %A %B %D)

FOOP          ((%:6 %N %:6 %M %:3 %L %:3 %H %:5 %I %:5 %J)
               (%:2 %E %:2 %F %:4 %G %:4 %A %:1 %B %:1 %D))


Vertex. A vertex is the point where two or more lines of the scene
meet; for instance, A, G, and K are vertices of the scene R3. Each
vertex has as name an atom which possesses additional properties
describing different attributes of the vertex in question. These are
'XCOR', 'YCOR', 'NVERTICES', 'NREGIONS', 'KIND', 'TYPE', and 'NEXTE'.
For example, vertex J (see scene R3) has in its property list:

XCOR          11.799999
              x-coordinate

YCOR          12.600000
              y-coordinate

NVERTICES     (%I %K %N)
              Counterclockwise ordered list of
              vertices to which J is connected.
              Unique up to cyclic permutation.
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NREGIONS      (%:7 %:5 %:6)
              Counterclockwise ordered list of
              regions to which J is connected.
              Unique up to cyclic permutation.

KIND          (%:7 %I %:5 %K %:6 %N)
              Counterclockwise ordered list of
              alternating nregions and nvertices
              of J. This list is unique up to
              cyclic permutation.

TYPE          (ARROW (%K %J %I %N %:5 %:6 %:7))
              List of two elements; the first is
              an atom indicating the type-name
              of J; the second is the datum of J.
              To be explained in next section.

(NEXTE)       Vertex J does not have the indicator
              NEXTE in its property list.

The KIND property of a vertex is formed by a man who stands at
the vertex and, while rotating counterclockwise, takes note of the
regions and vertices which he sees. NREGIONS and NVERTICES are then
easily derived from KIND, by taking its odd positioned elements and
its even positioned elements, respectively.
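
The derivation of NREGIONS and NVERTICES from KIND can be sketched as follows. This is an illustrative Python fragment, not the original LISP; the function name split_kind is an assumption, and the sample list is the KIND of vertex J with the % signs dropped.

```python
def split_kind(kind):
    """Split a KIND list (region, vertex, region, vertex, ...) into
    (NREGIONS, NVERTICES). Positions are counted from 1, so the odd
    positioned elements are kind[0], kind[2], ... in Python indexing."""
    nregions = kind[0::2]   # odd positioned elements: regions
    nvertices = kind[1::2]  # even positioned elements: vertices
    return nregions, nvertices


# KIND of vertex J in scene R3, written as plain strings.
kind_j = [":7", "I", ":5", "K", ":6", "N"]
nregions_j, nvertices_j = split_kind(kind_j)
```

Since KIND is unique only up to cyclic permutation, the two derived lists inherit the same freedom of starting point.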

NEXTE is a property that appears in certain vertices (none in
scene R3); it will be explained in the next section.

The property TYPE is also put by the function PREPARA; it classifies
each vertex into one of several types, as described in table
'VERTICES' (next page).
'L'.-     Vertex where two lines meet.

'FORK'.-  Three lines forming angles smaller than 180 degrees.

'ARROW'.- Three lines meeting at a point, with one of the angles
          bigger than 180 degrees.

'T'.-     Three concurrent lines, two of them collinear.

'K'.-     Two of the lines are collinear, and the other two fall
          on the same side of such lines.

'X'.-     Two of the lines are collinear, and the other two fall
          on opposite sides of such lines.

'PEAK'.-  Formed by four or more lines, when there is an angle
          bigger than 180 degrees.

'MULTI'.- Vertices formed by four or more lines, and not falling
          in any of the preceding types.


TABLE 'VERTICES'
Classification of rectilinear vertices.
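
The classification of table 'VERTICES' can be sketched from coordinates alone. The Python fragment below is an illustration under stated assumptions, not the TYPEGENERATOR function of the thesis: the names ray and classify_vertex, the angular tolerance, and the decision order are choices made here, and the datum of each type is not computed.

```python
import math


def ray(deg, length=1.0):
    """Endpoint of a line leaving the origin at the given angle in degrees."""
    rad = math.radians(deg)
    return (length * math.cos(rad), length * math.sin(rad))


def classify_vertex(center, ends, tol=1.0):
    """Type-name of a vertex at `center` joined by lines to the points in
    `ends`. `tol` (degrees) is an assumed tolerance for collinearity."""
    cx, cy = center
    angles = sorted(math.degrees(math.atan2(y - cy, x - cx)) % 360.0
                    for x, y in ends)
    n = len(angles)
    if n < 2:
        raise ValueError("a vertex needs at least two lines")
    # Angle between each line and the next one, counterclockwise.
    gaps = [(angles[(i + 1) % n] - angles[i]) % 360.0 for i in range(n)]
    widest = max(gaps)
    if n == 2:
        return "L"
    if n == 3:
        if abs(widest - 180.0) <= tol:
            return "T"            # two of the three lines are collinear
        return "ARROW" if widest > 180.0 else "FORK"
    if n == 4:
        for i in range(4):
            for j in range(i + 1, 4):
                if abs((angles[j] - angles[i]) % 360.0 - 180.0) <= tol:
                    others = [k for k in range(4) if k not in (i, j)]
                    # Side of the collinear pair on which each other line falls.
                    sides = [((angles[k] - angles[i]) % 360.0) < 180.0
                             for k in others]
                    return "K" if sides[0] == sides[1] else "X"
    return "PEAK" if widest > 180.0 + tol else "MULTI"
```

For instance, classify_vertex((0, 0), [ray(0), ray(30), ray(330)]) sees a 300 degree angle between two of three lines and answers 'ARROW'.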
TYPES OF VERTICES


The disposition, slope and number of lines which form a vertex
are used to classify it, a task performed by the function
(TYPEGENERATOR L), which stores the corresponding type in the
property list of the vertex.

The TYPE of a vertex is always a list of two elements; the first
is the type-name: one of 'L', 'FORK', 'ARROW', 'T', 'K', 'X', 'PEAK',
'MULTI'; the second element is the datum, which generally is a list,
whose form varies with the type-name and contains information in a
determined order about the vertex in question (see table 'VERTICES').


Vertices where two lines meet.


L.- A vertex formed by only two lines is always classified as of type 'L'.
Two angles exist at it, one bigger and the other smaller than 180°. The
datum is a list of the form

(E1 E2), where E1 is the region which contains
                  the angle smaller than 180°,
               E2 is the region which contains
                  the angle greater than 180°.

For instance, in scene R3 (see fig. 'R3'),
G has in its property list:

TYPE (L (%:4 %:7))

The vertices of type L present in R3
are B, E, G, I, L, N.


Vertices where three lines meet. 


FORK.- Three lines meeting at a point and forming angles smaller than
180° form a FORK.
Its datum is the vertex itself
at which the fork occurs. For instance,
vertex K has in its property list

TYPE (FORK %K)

The vertices of type FORK present
in R3 are C, K.


ARROW.- Three lines meeting at a point, with one of the angles bigger
than 180°.
The datum of an ARROW is a list like
(E1 E2 E3 E4 E5 E6 E7), where
E1 is the vertex at the 'tail',
E2 is the vertex at the center,
E3 is the vertex at the left,
E4 is the vertex at the right,
E5 is the region at the left,
E6 is the region at the right,
E7 is the region which contains the angle bigger than 180°.

For instance, vertex H has in its property list
TYPE (ARROW (%K %H %L %I %:3 %:5 %:7)) --fig. R3
The vertices of type ARROW present in R3 are A, D, H, J, M.


T.- Three concurrent lines, of which two are collinear.
The datum for a T is a list of the form (E1 E2 E3 E4 E5 E6 E7), where
E1 is the vertex at the 'tail' of the T.
E2 is the central vertex.
E3 is a vertex such that E1 E2 E3 is
   an angle between 90 and 180 degrees.
E4 is a vertex such that E1 E2 E4 is
   an angle smaller than 90 degrees.
   That is, E3 E2 E4 are collinear.
E5 is the region which contains the
   angle between 90 and 180 degrees.
E6 is the region which contains the angle smaller than 90 degrees.
E7 is the "central" region (where the 180° angle is).

For instance, vertex F (fig. R3) has in its property list
TYPE (T (%C %F %G %E %:2 %:4 %:7))

The only vertex of type T present in R3 is F.

See also "Matching T's or Nextes" below.


Vertices where four lines meet. 


K.- When two of the lines are collinear, and the other two fall on the
same side of such lines. The datum is a list of the form
(E1 E2 E3 E4 E5 E6 E7 E8), where
E1 is the central region.
E2 is the region having the 180° angle.
E3 is the collinear vertex which falls to the left.
E4 is the region to the left.
E5 is the vertex to the left.
E6 is the collinear vertex which falls to the right.
E7 is the region to the right.
E8 is the other vertex to the right.

R3 contains no vertices of type K. PA of figure BRIDGE is of type 'K'.
X.- When two of the lines are collinear, and the other two
fall on opposite sides of such lines. The datum is a list of the form
(E1 E2 E3 E4 E5 E6), where
E1 is one of the collinear vertices.
E2 is the region to the left of E1 C,
   where C is the vertex at the center.
E3 is the region to the right of E1 C.
E4 is the other collinear vertex.
E5 is the region to the left of E4 C.
E6 is the region to the right of E4 C.

For instance, we find in the property list of F
(figure BRIDGE):

TYPE (X (QA :26 :22 G :21 :30))

The only vertex of type X present in BRIDGE is F.
The datum for an X may also be in the form (E4 E5 E6 E1 E2 E3).
Vertices of four lines which are not of type K or X are either of
type PEAK or MULTI.


Other types of vertices. 


PEAK.- Formed by four or more lines, when there is an angle bigger
than 180°.

MULTI.- Vertices formed by four or more lines, and not falling in any
of the preceding types, belong to the type MULTI. R3 contains
no PEAKS or MULTIS.

The datum for vertices of type PEAK is of the form (E1 E2 E3), where
E2 is the region that contains the angle bigger than 180 degrees;
E1 is the vertex before E2, and E3 is the vertex after it (in the
counterclockwise sense).

The datum for vertices of type MULTI is of the form E1, where
E1 is the vertex itself.


NEXTEs or Matching T's. Two T's which are collinear and facing each other
(see figure) are called "matching T's", and each one is the "nexte"
of the other. The indicator NEXTE is placed in such vertices.
If the region E7 of a T (see figure) is the background, that
T can not be a matching T.

In the figure, E2 and F2 are matching T's because E1-E2 is
collinear with F2-F1. It is not required of E3-E4 to be parallel to
F3-F4. If several pairs of T's are possible, the closest is chosen:

P Q R
P - Q are matching T's,
and not P - R.

The matching T's will get involved in the determination of places
where a body is occluded by another object and later emerges visible
again.
For two T's to be NEXTEs or matching T's, it is required that
neither E7 nor F7 be background. The requirement should be extended to
all regions between E2 and F2, since a line can not go "under" the
background region.

Two straight lines always intersect (possibly at infinity); a way to
detect these background regions is to write functions (subroutines)
that find out if two segments of line intersect, or if one segment
intersects with a line.


LINES AND SEGMENTS
In the plane, two straight lines always meet.
Two segments, or a line and a segment, may or
may not meet. (A segment is a finite portion of a line.)
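
The two subroutines suggested above can be sketched with signed areas (cross products). This is a standard construction offered as an illustration in Python, not a transcription of the functions in the listing; the names cross, on_segment, segments_meet and line_meets_segment are assumptions.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o-a-b; positive if b lies to the
    left of the directed line from o to a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def on_segment(a, b, p):
    """For p already known to be collinear with a-b: is p within the segment?"""
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def segments_meet(a, b, c, d):
    """True if segment a-b and segment c-d have at least one common point."""
    d1, d2 = cross(c, d, a), cross(c, d, b)
    d3, d4 = cross(a, b, c), cross(a, b, d)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # each segment straddles the line of the other
    # Touching or collinear cases.
    return ((d1 == 0 and on_segment(c, d, a)) or
            (d2 == 0 and on_segment(c, d, b)) or
            (d3 == 0 and on_segment(a, b, c)) or
            (d4 == 0 and on_segment(a, b, d)))


def line_meets_segment(p, q, a, b):
    """True if the infinite line through p and q meets segment a-b."""
    return cross(p, q, a) * cross(p, q, b) <= 0
```

With noisy coordinates, the exact comparisons with zero would normally be replaced by a small tolerance.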
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THE PROGRAM 


We now describe SEE, and how it achieves its goals, by discussing 
the procedures, heuristics, etc., employed and the way they work. 


We begin with several examples. 


Example A. Scene 'TOWER'. This scene (see figure 'TOWER') is 
analyzed by SEE, with the following results: 


RESULTS 

(BODY 1. IS 2 83 81)

(BODY 2. IS 235 35 34)

(BODY 3. IS %23 817)

(BODY 4. IS %6 27 38)         ; Results for scene TOWER
(BODY 5. IS $30 %11 39)

(BODY 6. IS 43 844 342)

(BODY 7. IS 818 222)

(BODY 8. IS 829 219 221)


Example B. Scene 'MOMO'. Details of the program's operation are 
given. (skip to next page, if you wish). 


^Z $L SEE 1 ~           Go to DDT and load file SEE 1 (in tape
                        GUZMAN F), a binary dump of the program
                        SEE.
$G                      Start.

(UREAD MOMO S1 3) ^Q    Read the file MOMO S1 (in tape GUZMAN C)
                        from tape drive 3.

(PREPARA MOMO)          Convert MOMO from its Input Format form
                        to Internal Format, the proper form that
                        SEE expects.

(SEE (QUOTE MOMO))      Call SEE to work on MOMO.
                        Results appear in next page.


Notes: ^Z (control Z) is keyed by striking the Z key while holding
down simultaneously the CONTROL key. (Memos i6/,6% M9)
~ denotes carriage return.
$ denotes the character "alt. mode". (See also instructions in listing.)
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SEE 58 ANALYZES MOMO 

EVIDENCE 
LOCALEVIDENCE 

TRIANG 
GLOBAL 

((NIL) (0838) 60046 GO043 GO041 GOD40) (1819) 60046 GO045 Goetc 
LOCAL 

(LOCAL ASSUMES (317) (39) SAME BODY) 

(LOCAL ASSUMES (29 817) (818) SAME BODY) 

LOCAL ; 

CONTL) (NIL) €(80)) (NIL) (NELD (NILD (0898 837 $39) GO04S ec. 
LOCAL 

(4(83 82 81) GOOBL GOOZ9 GOO3N GOOZB) (1832 833 827 326) Goekk- 
LOCAL - 

SMB 

RESULTS 

(BODY 1. IS 83 82 33)

(BODY 2. IS 832 833 827 #26)

(BODY 3. IS 828 #33)

(BODY 4. IS 820 834 819 830 229)

(BODY 5. IS 836 335)
(BODY 6. IS 824 85 821 24)

(BODY 7. IS 825 %23 %22)

(BODY 8. IS 844 813 845)

(BODY 9. IS 310 336 811 812)

(BODY 10. IS 837 818 806)

(BODY 11. IS 87 88)

(BODY 12. IS 838 837 839)

NIL 


Most of the scenes contain several "nasty" coincidences: a vertex of
an object lies precisely on the edge of another object; two nearly
parallel lines are merged into a single one, etc. This has been
done on purpose, since an unsophisticated pre-processor will tend to
make this kind of error.


Example C. R3. Analysis by SEE gives


(BODY 1. IS %:2 %:1 %:4)
(BODY 2. IS %:6 %:5 %:3)      RESULTS FOR 'R3'


The % sign indicates the dextral scenes (cf. page 233). The signs
may be ignored.
Parts of SEE. The program is straightforward; it does not call
itself recursively; it does not do "pattern matching"; it does not do
tree search. It is formed by several main parts, sequentially
executed. They are


LINKS FORMATION. An analysis is made of vertices, regions and
   associated information, in search of clues that indicate that two
   regions form part of the same body. If evidence exists that
   two regions in fact belong to the same body, they are linked
   or marked with a "gensym" (both receive the same new label).*
   There are two kinds of links, called strong (global) or weak
   (local).

   Some features of the scene will weakly suggest that a group
   of regions should be considered together, as part of the same
   body. This part of the program is that which produces the
   'local' links or evidences.


NUCLEI CONSOLIDATION. The 'strong' links gathered so far are
   analyzed; regions are grouped into "nuclei" of bodies, which grow
   until some conditions fail to be satisfied (a detailed
   explanation follows later).

   Weak evidence is taken into account for deciding which of
   the unsatisfactory global links should be considered
   satisfactory, and the corresponding nuclei of bodies are then
   joined to form a single and bigger nucleus.


BODY RETOUCHING. If a single region does not belong to a larger 
nucleus, but is linked by one strong evidence to another region, 
it is incorporated into the nucleus of that other region. If 
necessary, more nuclei consolidation could be done after this 
step. 

A last attempt is done to associate the remaining single 


regions to other bodies. 


The regions belonging to the background are screened out, and the
results are printed.


* In LISP, a "gensym" (generated symbol) is a new atomic symbol,
previously unused.
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Auxiliary Routines 


Three functions are used constantly, and will be described now.

THROUGHTES. "Through a chain of T's." Allows properties or
configurations to extend along straight lines; for instance, the property
«'A' has as neighbor an L» can be extended so as
to say «throughtes, 'A' has as neighbor an L».

[Schematic representation of throughtes: figure not reproduced.]

Strict definition. Throughtes is defined as one of

(1) the trivial case (meaning the two vertices on both sides of the
    throughtes symbol are in fact the same);

the remaining cases, given by figures that are not reproduced here,
include matching T's.

Example: see also annotations on Listing.


GOODT. If a vertex V is considered a "good T", (GOODT V) is TRUE;
false otherwise.

[The value of (GOODT V) is given by a table of cases drawn as figures,
one of them involving parallel lines; the figures are not reproduced.
The last entry of the table reads "T otherwise".]

As we see, this function tries to distinguish between T's originated by
occlusion, such as O, and T's originated by accident (A).


NOSABO. "Not same body." Acts as a link inhibitor.


If consulted, (NOSABO .. V ..) will inhibit, in the following
conditions, the link that vertex V may have created:

[Figures for conditions (1) to (5), each marking the inhibited link
(prohibited, ignored, forbidden, not created) at vertex V, are not
reproduced; condition (4) involves a PEAK.]

Nosabo tries to find conditions indicating that two regions should
not be considered as part of the same body; hence, if consulted,
Nosabo may forbid a link among them. Some heuristics place links
without asking Nosabo's approval, and Nosabo can not "erase" a link
placed without its authorization.

If none of conditions (1) to (5) is met, Nosabo will be False,
indicating no inhibition was found, and it is up to the program that
asked Nosabo's opinion to lay or fail to lay the link in question.


We proceed now to explain in considerable detail each of the parts 
of SEE. This will help the reader to understand the behavior of 


the program, its strengths and deficiencies. 


LINK FORMATION 


Several subroutines are devoted to creating weak and strong 
links. See also Listing. 


CLEAN Removes several unwanted properties. 


EVERTICES. Each vertex is considered under the following rules:


L.-    No evidence is created directly by this type of vertex.
       Nevertheless, the "L" is used in many combinations
       with other vertices to account for evidence. As we
       saw, Nosabo uses L's; "Legs" will use them, too.


FORK.- == No link is created if any of the three regions is
          background (but see below).
          Example (unless otherwise indicated, all examples
          are from figure 'BRIDGE', page 94): Vertex J
          does not generate links.

       == Otherwise, three links are created as shown, except
          that each one may be inhibited by Nosabo.
          Example. Vertex JB only produces link :5-:8.
          Link :5-:9 is inhibited because S is a 'T'; Nosabo
          also forbids link :8-:9 because KB is an 'arrow'.
          This last rule is the most powerful of the heuristics.

       == Two links are created as shown, without asking Nosabo,
          if the fork is connected to the central line of
          an arrow. (No link is put here: [figure].)
          Example: In fig. R19, PA generates links :29-:17
          and :35-:17.
          This last heuristic is of help where there are concave
          objects (fig. R19).
ARROW.- == Link if an L is connected to its central line,
           and the region shaded contains only that arrow
           as a "proper-arrow", and no Forks.
           [Figure:] Region :1 contains arrow A
           as a "proper-arrow"; also
           region :2, but not region :3. Capisce?
           Example. BB links :10 with :4.
           Allows "lateral faces" of legs to be properly
           identified and agglutinated.
        == Otherwise, link except if inhibited by Nosabo.
           Example. D lays a link between :26 and :23.
           Powerful and general heuristic.


X.-    == No link if the X comes from the intersection
          of two lines.
       == Otherwise, link as shown except if Nosabo disagrees.
          Example. G originates links :26-:22 and :21-:30;
          this last one will later be erased or disregarded,
          since :30 is the background.

K.-    == No link.

PEAK.- == Links are established between contiguous regions,
          except those to the region containing the angle
          bigger than 180°. These links are subject to
          Nosabo inhibition.
          Example. In fig. 'CORN', JJ generates links
          :8-:9 and :9-:10.
          Of certain use, specially with pyramids and
          "pointy" objects.

MULTI.- == No link.
           The reason is:
           (1) if the vertex is "genuine" (cf. page v4),
               although it generates no links, the object
               having it will probably possess many
               other vertices, through which links
               will get established, and
           (2) if the vertex is "false" because it is the
               result of the casual coincidence of two
               or more genuine vertices, mistakes are
               avoided by abstaining from generating links.
               This is generally the case.
           An improvement is possible, by allowing MULTI
           vertices to place links.
T.-    == If matching T's, link as shown, without consulting
          Nosabo. Avoid linking to the background.
          Each pair of matching T's produces these links
          only once; that is, we do not produce two links
          while analyzing A and another two at B.
          Do not link if the middle region of a 'T' is the
          background.
          What we are trying to do here is to find places
          where a body appears as two disconnected parts.

       == Link (without Nosabo's consent) as shown if the
          central segment of the 'T' separates two non-
          background regions, and these have the background
          as neighbor, and part of the separations between
          background and non-background are parallel to the
          central segment of the 'T'.

          Avoid double links in the following case (link
          just once): [figure not reproduced].
          Example. TA links :21 with :27 (F-G,
          RA-TA and JA-IA are parallel).
          Favors occluded bodies with parallel faces.
          Also, see "STUDY" in listing, still an
          experimental feature.

       == Two links are placed as shown (without asking
          Nosabo) if the central line of the T is
          connected to the central line of an arrow.
          It is of help where there are concave objects.


Table ‘Global Evidence' shows compactly the main rules just discussed. 


LOCALEVIDENCE 


Weak or local links are laid here; they are used to
indicate, in a feebler way, that two faces or regions may be part of
the same object.

Nosabo can not inhibit local links.

== A weak link is placed as shown (dotted) if,
   throughtes, an L is connected to an Arrow,
   and the two indicated edges are parallel.
   We call this configuration 'Leg'.

   Example (all examples are from figure 'BRIDGE',
   unless otherwise indicated). Vertex FA is
   a Leg (FA - QB is parallel to EA - DA)
   that links weakly :18 with :19.

== In a Leg, if there are two matching T's as
   shown, a weak link is placed correspondingly.
   Example. In fig. 'TRIAL' (page 88), a weak
   link or evidence is placed between :7 and :4,
   because EE is a Leg, and L and E are matching T's.
The heuristics described will sometimes produce a "wrong linkage",
linking two regions that do not belong to the same body. These mistakes
are not likely to confuse SEE, since the handling of these links (and
all of SEE, in general) is done under the assumption or knowledge that
the information is noisy and somewhat unreliable.

Strong links are shown dotted; weak links are not shown.

[Panels (A), (C), (D), (E), (F), (G), (H), (I): figures not reproduced.]

TABLE 'GLOBAL EVIDENCE'
TRIANGLE.- == A Triangle is a 3-vertex region, of which
              two are interconnected T's, the type of the
              other vertex being irrelevant.

              Two triangles are weakly linked if they are
              (1) "facing each other", and
              (2) "properly contained", meaning that D has
                  to fall on the same side of AB as C does,
                  and similarly for the other vertices, and
              (3) AB is parallel to EF, and AC to DE.

              The heuristic helps with faces of a prism
              that is badly obscured. It does not help
              much, since it gives only a weak link. On
              the other hand, this weakness prevents
              mistakes when the two triangles are not from
              the same body.
              A possible improvement consists of choosing
              the closest of two triangles, if several
              candidates are possible.
              Example. In figure 'WRIST' (page 126), weak
              links are placed between triangles 5 and 6,
              and between 1 and 2.
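
Conditions (2) and (3) of the Triangle heuristic reduce to two geometric predicates, sketched below in Python. The names same_side and parallel, the tolerance, and the sample coordinates are assumptions for illustration; condition (1), "facing each other", depends on the figure and is not attempted here.

```python
import math


def same_side(a, b, p, q):
    """True if points p and q fall strictly on the same side of line a-b."""
    def side(pt):
        return (b[0] - a[0]) * (pt[1] - a[1]) - (b[1] - a[1]) * (pt[0] - a[0])
    return side(p) * side(q) > 0


def parallel(a, b, c, d, tol=1e-9):
    """True if segment a-b is parallel to segment c-d, within a relative
    tolerance on the sine of the angle between them."""
    ux, uy = b[0] - a[0], b[1] - a[1]
    vx, vy = d[0] - c[0], d[1] - c[1]
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    if norm == 0:
        raise ValueError("degenerate segment")
    return abs(ux * vy - uy * vx) <= tol * norm


# Hypothetical triangles ABC and DEF.
A, B, C = (0, 0), (4, 0), (0, 3)
D, E, F = (1, 6), (1, 9), (5, 9)
check_2 = same_side(A, B, C, D)                        # D on the side of AB where C is
check_3 = parallel(A, B, E, F) and parallel(A, C, D, E)
```

Here both checks succeed, so this pair of triangles would be a candidate for a weak link, subject to condition (1).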
              Example. Figure 'TRIAL' receives the
              following strong links (full lines) and
              weak links (dotted lines).


FIGURE TRIAL


(BODY 1 IS :6 :2 :1)
(BODY 2 IS :11 :12 :10)
(BODY 3 IS :4 :9 :5 :7 :3 :8 :13)
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FIGURE ‘TRIAL 


The links could be represented as 


Figure 'TRIAL - LINKS' 


Strong (solid) and weak (broken lines) 
links of figure 'TRIAL'. 


SEE prints these links in the following way (cf. also p. 110); in the
printout, :11 has four links emanating from itself.


CONIL) (C821) GOO14 GO013 GOO11 Goo10) ¢ 
(212) GO01S GO014 GoGi3 GOd12) (1313) Ga 
021) ((89) GOO22 GOOZ! 60020 GoG19 GO017 
GO016) (1230) GOO1S GOG22 GOd11 GO010) 

(033) GO034 G6G025 GOoo24) 1184) GULS3 G00 
$2 G0026 vwuv2S GOO]3) (186) 39031 60030 

GOS29 GOG27) ((:5) GO926 GuoeS GEG22 Goo 
18 GOO17) (127) GUUSS VOUS]? GOULS 60018 

GOG16) 1688) GIIS6 CUIZ4 GOO29) (1t2) GO 
035 GOC31 GOOZ9 GOO28) (1214)) Ch8t) GOO 
35 GOG30 GO026 G0027)) 


Strong Links of ‘TRIAL 


Weak links of scene 'TRIAL' are


((:2 :1) (:6 :2) (:6 :1) (:4 :5)
 (:4 :7) (:9 :7) (:9 :5)
 (:13 :9) (:3 :8) (:9 :8)
 (:12 :10) (:11 :12))


For instance, there is a weak link between :12 and :10.


Weak links of 'TRIAL'.
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The next step is to gather all this evidence and to form tentative 
hypotheses of objects as assemblages of faces with many links among 
them. 


NUCLEI CONSOLIDATION 


All the links to the background are deleted, since it can not 
be part of any body. 

Strong and weak links exist among the different regions of a 
scene. They are consolidated in that order by two subroutines, 
Global and Local. 


GLOBAL 


Groups of faces with an abundance of strong links among them 
are first found; these "nuclei" will later compete for other faces 


more loosely linked. 


Definition: a nucleus (of a body) is either a region or a set of 


regions that has been formed by the following rule. 

Rule: If two nuclei are connected by two or more strong links, 
they are merged into a larger nucleus. 

More detailed rules appear on page 25, in section 'Simplified
view of Scene Analysis'.

For instance, in the figure below, regions :1 and :2 are put
together, because there exist two links among them, to form nucleus
:1-2. Now we see that region :3 has two links with this nucleus :1-2,
and therefore the new nucleus :1-2-3 is formed.

[Figure: Two links between two nuclei merge them.]

We let the nuclei grow and merge under the former rule, until
no new nuclei can be formed.
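
The merging rule can be sketched as a loop that runs until nothing changes. The Python below is a simplified illustration, not the GLOBAL routine of the listing (which works on gensym labels and follows the more detailed rules cited above); the name consolidate_global and the representation of links as pairs of region numbers are assumptions.

```python
from collections import Counter


def consolidate_global(regions, strong_links):
    """Merge nuclei joined by two or more strong links until no merge applies.
    `strong_links` is a list of (region, region) pairs, repeats allowed."""
    nuclei = [frozenset([r]) for r in regions]   # every region starts alone
    merged = True
    while merged:
        merged = False
        owner = {r: n for n in nuclei for r in n}
        # Count the strong links running between each pair of distinct nuclei.
        count = Counter()
        for a, b in strong_links:
            na, nb = owner[a], owner[b]
            if na != nb:
                count[frozenset([na, nb])] += 1
        for pair, k in count.items():
            if k >= 2:
                na, nb = tuple(pair)
                nuclei = [n for n in nuclei if n not in (na, nb)] + [na | nb]
                merged = True
                break   # recount the links after every merge
    return set(nuclei)


# Regions :1, :2, :3 as in the example above, plus a region :4 holding
# a single strong link to :3.
links = [(1, 2), (1, 2), (3, 1), (3, 2), (4, 3)]
result = consolidate_global([1, 2, 3, 4], links)
```

Region :4 stays outside nucleus :1-2-3 because only one strong link reaches it.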
When this is the case, the scene has been partitioned into
several "maximal" nuclei; between any two of these there is at most
one link. For example, figure 'TRIAL-LINKS' will be transformed into
figure 'TRIAL-NUCLEI'.


Figure 'TRIAL - NUCLEI'
Maximal nuclei of scene TRIAL.


LOCAL. If some strong link joining two "maximal" nuclei is also
reinforced by a weak link, these nuclei are merged.

The weak links of figure TRIAL are shown as dotted lines in
figure 'TRIAL-LINKS' (page 90); they transform figure 'TRIAL-NUCLEI'
into figure 'TRIAL-FINAL'.


Figure 'TRIAL - FINAL'
Nuclei of scene TRIAL after merging
suggested by local links.
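
A sketch of this reinforcement step, in Python, follows. It assumes one particular reading of the rule, namely that a weak link "reinforces" a strong link when it joins the same two nuclei; the name consolidate_local and the data layout are likewise assumptions, not the LOCAL routine of the listing.

```python
def consolidate_local(nuclei, strong_links, weak_links):
    """Merge two nuclei when a strong link between them is accompanied by
    a weak link between the same two nuclei."""
    nuclei = [set(n) for n in nuclei]

    def find(region):
        for n in nuclei:
            if region in n:
                return n
        return None   # e.g. a background region already screened out

    changed = True
    while changed:
        changed = False
        for a, b in strong_links:
            na, nb = find(a), find(b)
            if na is None or nb is None or na is nb:
                continue
            reinforced = any(
                (find(u) is na and find(v) is nb) or
                (find(u) is nb and find(v) is na)
                for u, v in weak_links)
            if reinforced:
                nuclei.remove(nb)
                na |= nb          # nb is absorbed into na
                changed = True
                break             # nuclei changed: start over
    return {frozenset(n) for n in nuclei}


# Hypothetical maximal nuclei: one strong link 3-4 backed by a weak link,
# and one strong link 4-5 with no weak support.
final = consolidate_local([{1, 2, 3}, {4}, {5}],
                          strong_links=[(3, 4), (4, 5)],
                          weak_links=[(4, 3)])
```

The unsupported strong link 4-5 is left in place; such residual links are ignored when the results are printed.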


BODY RETOUCHING 


Additional heuristics assign unsatisfactory faces to existing 


nuclei, or isolate them. SINGLEBODY and SMB are used for this task. 
SINGLEBODY. A strong link joining a nucleus and another nucleus composed
of a single region is considered enough evidence to merge the nuclei in
question if there is no other link emanating from the single region. A
message is printed indicating these merges.

Such rules produce no change in fig. 'TRIAL-FINAL', and
therefore its nuclei will be reported as bodies.

A more complex example shows the retouching operation. Figure 


"BRIDGE' undergoes these transformations: 


Scene BRIDGE                                   Fig. 'BRIDGE'

Weak and strong links among regions            Fig. 'LINKS-BRIDGE'

Maximal nuclei (2 or more strong links)        Fig. 'NUCLEI-BRIDGE'

Maximal nuclei enlarged by weak link action    Fig. 'NEW-NUCLEI-BRIDGE'

Id. enlarged by single undisputed regions      Fig. 'FINAL-BRIDGE'

Id. enlarged by good neighbors, "goodpal".     Fig. 'FINAL-BRIDGE'
Final result.                                  (no change in this case).
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FIGURE 'LINKS-BRIDGE' 
We see that in figure 'NEW-NUCLEI-BRIDGE', nucleus :16 is merged
by SINGLEBODY with nucleus :18-19 (see figure 'FINAL-BRIDGE'). Nucleus
:28-29 is not joined with :26-22-23 or with :24-25-27-12-21-9. Even if
nucleus :28-29 were composed of a single region, it still would not be
merged, since two links emerge from it: two nuclei claim its possession.
This rule joins single regions having only one possible "owner"
nucleus.
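
The "only one possible owner" rule can be sketched as follows. This Python fragment is an illustration only; the name singlebody and the treatment of links as region pairs are assumptions, the example data are loosely modeled on the BRIDGE nuclei mentioned above, and the message that SEE prints for each merge is omitted.

```python
def singlebody(nuclei, strong_links):
    """Absorb each single-region nucleus into the nucleus at the other end
    of its strong link, provided exactly one link emanates from the region."""
    nuclei = [set(n) for n in nuclei]
    singles = [next(iter(n)) for n in nuclei if len(n) == 1]
    for r in singles:
        home = next(n for n in nuclei if r in n)
        if len(home) != 1:
            continue                      # no longer alone: already merged
        links = [(a, b) for a, b in strong_links if r in (a, b) and a != b]
        if len(links) != 1:
            continue                      # none, or disputed by several nuclei
        a, b = links[0]
        other = b if a == r else a
        target = next((n for n in nuclei if other in n), None)
        if target is None:
            continue
        nuclei.remove(home)
        target.add(r)
    return {frozenset(n) for n in nuclei}


# :16 has one link (to :18); the hypothetical single region :28 has two.
after = singlebody([{16}, {18, 19}, {28}, {26, 22, 23}, {24, 25}],
                   [(16, 18), (28, 26), (28, 24)])
```

Region :28 remains alone because two nuclei claim it.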


SMB. Two systems of links are used by SEE. One consists of weak and
strong links, produced by examining each vertex, and culminates forming
nuclei under GLOBAL, LOCAL, etc.

The second system constitutes a different network of links; SMB
works in the second system. It is motivated by the desire to collect
evidence not directly available through the vertices. It gathers
evidence from the lines or boundaries separating two regions, in an
effort to answer the question: Are two given neighboring regions part
of the same object, or are they not? That is, are two contiguous regions
"good neighbors" ("good pals")? If they are, a special link, s-link,
is placed, eventually forming a network independent of weak and strong
links, that will collapse, in a somewhat peculiar way. Thus, a great
amount of unnecessary duplication could be possible in the information
carried by both systems of links. To reduce it, the s-links are designed
to complement and extend, rather than to re-do, the agglutination
produced by weak+strong links. They (the s-links) will, therefore, mainly
study single faces not satisfactorily accounted for.

SMB uses the predicate (GOODPAL R S), which acquires the value T
(true) if R and S are two contiguous "good neighbor" regions.

To satisfy this, their common boundary must not be empty, and must
lack L's, FORKs, ARROWs, K's, X's, PEAKs, MULTIs. In addition:

== [figure] Not good: (GOODPAL R S) = F.

== [figure] Not good: (GOODPAL R S) = F.
   "L" or (in general) vertex that makes
   (NOSABO R S) to be true.

== O.K. otherwise: (GOODPAL R S) = T.
   In particular, [figure]
   is O.K. if (NOSABO R S) = F.


SMB analyzes the nuclei formed under weak+strong links that, after
SINGLEBODY actuation, still remain formed by a single face or
region. The steps are:
1. A network of s-links is formed by putting an s-link between regions
   forming a nucleus all by themselves, and their goodpal neighbors.
2. If exactly one nucleus is s-linked to one of those regions (that
   is to say, if such single-region single-nucleus has precisely
   one good-pal), the region gets absorbed by the nucleus; otherwise
   the region is reported as a body in itself (consisting of a single
   region).

[Figures not reproduced. In the last one, the configuration does not
change because :3 has two s-links.]

Note that

a. The s-links are not used to form nuclei as the weak+strong links
   were; they only help certain isolated faces to join bigger
   structures.

b. Two s-links between two regions have the effect of one.


Example. In figure 'HARD', regions :6 and :7 get joined by SMB.
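
Steps 1 and 2 can be sketched in Python as below. This is not the SMB routine of the listing: the name smb is an assumption, the GOODPAL predicate is replaced by a precomputed list of good-pal pairs, and "precisely one good-pal" is read as exactly one s-linked region.

```python
def smb(nuclei, goodpal_pairs):
    """Absorb a single-region nucleus into the nucleus of its only good pal.
    `goodpal_pairs` lists the (R, S) pairs for which (GOODPAL R S) holds."""
    nuclei = [set(n) for n in nuclei]
    # Storing s-links as unordered pairs in a set makes two s-links
    # between the same two regions count as one (note b).
    s_links = {frozenset(p) for p in goodpal_pairs if p[0] != p[1]}
    singles = [next(iter(n)) for n in nuclei if len(n) == 1]
    for r in singles:
        home = next(n for n in nuclei if r in n)
        if len(home) != 1:
            continue                       # already joined to something
        pals = [next(iter(link - {r})) for link in s_links if r in link]
        if len(pals) != 1:
            continue                       # no pal, or several: stays a body
        target = next((n for n in nuclei if pals[0] in n), None)
        if target is None:
            continue
        nuclei.remove(home)
        target.add(r)
    return {frozenset(n) for n in nuclei}


# Regions 6 and 7 are each other's only good pal; region 3 has two.
bodies = smb([{6}, {7}, {1, 2}, {3}],
             [(7, 6), (6, 7), (3, 1), (3, 2)])
```

As in note a, s-links never merge two multi-region nuclei in this sketch; they only attach isolated single regions.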
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SEE 58 ANALYZES HARD 
EVIDENCE 
LOCALEVIDENCE 

TRIANG 

GLOBAL 

CONTR) (08340) ((86)) (68360) 6(824) 60026 GO025 GO023 Goetc. 
0044 60043 GO042) (1817) 6O047 GO046 GO04S GO044) ((87)) ete. 
0041 G00359) ((821) GO050 GoN40 GO03S GO029 Gonz8 60027) ¢ -- 
0038 60036 GO019) (4826) GOOS4 GOOS3 GO037 GONS6) (1827) -.- 
60055 60023 GO020 GO015) ( (832) Go057 60056 GO034 60033) 

8 60048) ((84) GOOSe8 GO048) (1840) GOOSY GOOSZ GOO31) «C8 
#19) 60064 60063 GO062 GO061) (£820) GN064 GOO62 GO060 GO. 
330) G0056 G0035 GO033 60016) (¢815) 60066) (1816) 60066) 
CONTLD CO834)) 60860) 608560) CNIL) CNTR) ONILD ONIL) (08° 
019 GO0S3 GO036 Go0s4 60038 G0037 GOOL9) (NIL) (4824 822 
0040 60039 60029 Go028 60027 GOdes Gou022 Go0SS GOO23 G0G2 

) (NIL) ((85 84) 60048 GOOSS GONMS) (NIL) (¢823 817 814) | 
218 319 820) 60060 G0064 GO063 G006% GU064 GO062 60060 SO 
832 331 £30) 60033 30057 60034 60056 GO035 GO033 Go016) ( 


LOCAL 

(LOCAL ASSUMES (811) (212) SAME BODY) 

(LOCAL ASSUMES (8135) (8146) SAME BODY) . . 
CONTR) 008349) 60869) 00836)) (NIL) (NIL) (870) NL) (N 

O19) ((324 822 83 823 821 328 829) GOD020 60026 GO025 GOC4 

0055 G0023 GO020 GO015) ((81 82.233) 60052 60051-60017 GO: 

43 60047 GO0046 60044 80047 GON4S GOD043 50042) CNEL) (4818 
$10 88) 60032 SO032 S0065 GO0S9 GOO3S1 SOOSH) (¢832 331 8: 
) CNIL) €0835)) (0842 844) GO067) (NIL)? 

LOCAL 

(€(C3L2 $11) CGOO67) ((816 815) 60066) (1832 231 230) GOOS3 

GO065 GOO59 Goo3s1 So0s0) (¢218 219 220) GG060 GO064 GO063: 

6 60044 60047 G0045 GO043 GD042) ((35 34) GOO46 6005S God. 

3 821 328 829) 60020 60026 GO02S GO049 Good, GO021 GOO0SO ( 

15) ¢(€(825 826 827) GOO19 GC0S3 GO0036 GO0S4 Goose 60037 GOI. 

LOCAL 

SMB

(SMB ASSUMES :7 :6 SAME BODY)

RESULTS

(BODY 1. IS 8142 844)

(BODY 2. IS 846 £15)

(BODY 3. IS 832 831 830)

(BODY 4. IS 8&9 810 38)

(BODY 5. IS 8238 319 820)          RESULTS FOR HARD

(BODY 6. IS 833 817 38414)

(BODY 7. IS 3&5 34)

(BODY 8. IS 8&1 82 833)

(BODY 9. IS 824 2322 83 3823 #21 $28 229)

(BODY 10. IS 825 826 3827)

(BODY 11. IS :7 :6)

NIL 
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RESULTS. After having screened out the regions that belong to the 


background, the nuclei are printed as "bodies". 


In this process, the links which may be joining some of the
nuclei are ignored: RESULTS considers the links of figure
'FINAL-BRIDGE', for instance, as non-existent. These links
are the result of imperfections in the heuristics or mistakes in the
placement of links, and may point out different parsings. An
improvement to SEE would be to try to "explain" these residual links.


Summary. SEE uses a variety of kinds of evidence to link together
regions of a scene. The links in SEE are supposed to be general
enough to make SEE an object-analysis system. Each link is a piece
of evidence that suggests that two or more regions come from the
same object, and regions that get tied together by enough evidence
are considered as "nuclei" of possible objects.


Examples and discussion are in the next section.
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ANALYSIS OF MANY SCENES 


Until we have an adequate analytic theory, the behavior of a
heuristic program is best understood with examples. There are
several ways to go about this:


Simple   In order to learn what a program does, simple examples, each
one illustrating a single feature or group of features, are very
appropriate.

Favorable   A shiny impression of a set of routines is obtained by
presenting 'favorable' cases, designed to enhance the characteristics
of the program in front of the unsophisticated observer.

Of course, of all possible inputs, there is a subset that will
produce outputs very pleasant in terms of speed, ease of programming,
generality, accuracy, or whatever other feature that system
advertises. This subset tends to get the highlights in the
descriptions.


Nasty   Examples in which the program does particularly poorly are
useful, if well chosen, to illustrate the weak points and pitfalls
of the techniques used, the restrictions and constraints in the input,
etc. They may point out improvements or extensions.


Examples having a very weak connection with the purpose or
intention of the routines or algorithms discussed serve no useful
end, except perhaps to point out that the maker of such examples did
not understand the issues. For instance, one could take a box full
of pins, drop them on the table, take their picture and ask SEE to
work on it.


A collection of simple, favorable, and nasty examples follows.
They are not in that order.

A discussion is found at the end of this section.
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Stereo Scenes 
Analysis of stereographic pictures will be found in
the section 'Stereo Perception'.

Finding the background   Examples where the background is not known
in advance and has to be deduced are given in the section 'Background
Discrimination by Computer'.
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LIST OF SCENES ANALYZED BY SEE IN THIS SECTION 


                         PAGE

   Comments.      Scene (figure).      Computer Results.

     107              108                  109
     110              111                  112
     113              114                  115
     116              117                  118
     119              120                  122
     119              121                  122
     123              124                  125
     126              127                  128
     129              130                  131
     132              133                  134
     135              136                  137
     138              141                  142
     138              139                  140
     143              144                  145
     146              147                  148
     149              150                  151
     152              153                  154, 155
     156              158                  157
     156              159                  160
     161              162                  163
     164              165                  166
     167              168                  169
     170              171                  172
     173              174                  175
     176              177                  178
     179              180                  181


Scene R17   The three prisms are found. In scenes like this, the
position of one or two vertices may alter the analysis made by SEE,
by radically changing the slope of a small segment (such
as KL and GH, figure 'R17'), killing several T-joints and separating
regions :1-2 from :5-6.

Small errors in the coordinates of vertices K, L, G, H, and a few
others will drastically change the slope of segments of short length.
This will transform G and K into Arrows or Forks, so that G and K
will no longer be matching T's (cf. also 'Conservatism and Tolerance',
page 173). As a consequence, body :2-1 will be disconnected from body
:5-6. This annoying problem is not difficult to correct at the
preprocessor level, since there is good information about the slope
of the (long) line BN: the slope of KL has to agree with the slope of
BN, giving a good estimate of its true shape. The
rule seems to be that these short segments should be
"re-oriented" if necessary, to agree with the longer ones, which are
more reliable. Deeper analysis is found in section 'On Noisy Input'.

SUGGESTION   The preprocessor should consider the hypothesis
that BKLN are colinear -- or SEE should propose it
for confirmation (see 'Division of Work in Computer Vision', p. 6° ).
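
The "re-orientation" of short segments can be sketched as follows. This
is only an illustration, not part of SEE or of any preprocessor
described here: the function names, the tolerance and the coordinates
are hypothetical. The sketch assumes that the long line (BN) is
trusted, and that any vertex lying close enough to it is moved onto it.

```python
import math

def project_onto_line(p, a, b):
    """Orthogonal projection of point p onto the infinite line through a and b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    return (ax + t * dx, ay + t * dy)

def distance_to_line(p, a, b):
    """Distance from p to the line through a and b."""
    qx, qy = project_onto_line(p, a, b)
    return math.hypot(p[0] - qx, p[1] - qy)

def straighten(points, a, b, tolerance):
    """Move every point within `tolerance` of line a-b onto that line."""
    result = {}
    for name, p in points.items():
        if distance_to_line(p, a, b) <= tolerance:
            result[name] = project_onto_line(p, a, b)
        else:
            result[name] = p  # too far away: not assumed colinear
    return result

# Hypothetical coordinates: B and N define the long, reliable line;
# K and L are noisy vertices of a short segment; G lies elsewhere.
B, N = (0.0, 0.0), (10.0, 0.0)
noisy = {"K": (4.0, 0.3), "L": (5.0, -0.2), "G": (4.0, 3.0)}
fixed = straighten(noisy, B, N, tolerance=0.5)
```

Before the correction, segment KL has a slope of -0.5 while BN is
horizontal; afterwards K and L lie on BN, and G is left untouched.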


The % signs   In the printouts of some scenes, such as R17 (see 'RESULTS
FOR R17' in page 109), a % sign appears as part of the name of every
region and vertex; that is, %:3 instead of :3. This will be the case
in all scenes having names starting with the letter R, differentiating
the "right regions" from the "left regions". This will become clear
in the section 'Stereo Perception', page 233; until then, disregard
the %'s.
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RI7 


FIGURE 'R 1 7'

The three prisms were correctly found.
There are several "nasty" coincidences
in this scene, simulating the data
that a not-too-satisfactory preprocessor
will tend to provide.
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[Printout 'RESULTS FOR R17' (SEE 58 ANALYZES R17): three bodies are
reported. The listing is printed rotated in the source and is not
legible.]
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Scene L3   Without difficulty, two bodies are found. Each region
contains four strong links relating it with other regions (see
'RESULTS FOR L3'). LOCAL is not needed to form nuclei; neither is
SINGLEBODY or SMB.


Explanation of the printout produced by the program   On page 112, a
printout of the results appears. The format is the same for every
scene. It starts by saying

     SEE 58 ANALYZES L3

which identifies the name of the program (SEE), its number (version
number 58), and the scene to be analyzed (L3).


EVIDENCE 
LOCALEVIDENCE 
TRIANG 

GLOBAL 


The different sections of the program print their name, when they 
are entered. 
We then come to a list containing regions (such as :6) and 'gensyms'
(such as G0009):

((NIL) ((:6) G0009 G0007 G0005 G0004) ((:5) G0010 G0008
G0007 G0004) ((:4) G0010 G0009 G0008 G0005) ((:1) G0015
G0013 G0012 G0011) ((:2) G0016 G0014 G0013 G0011)
((:3) G0016 G0015 G0014 G0012) ((:7)))


This list contains the nuclei and the links (strong links); the first
nucleus that we see is ((:6) G0009 G0007 G0005 G0004), meaning
that from nucleus (or region) :6 emanate four links, namely G0009,
G0007, G0005 and G0004. We can represent this graphically:

[Diagram: region :6 with its four links.]

We then see "LOCAL" (when this function is entered, it prints its
name), then the list of nuclei again, this time shrunk somewhat by
LOCAL; finally, we see "RESULTS", and then 2 bodies, followed by NIL,
meaning the end of the program. (See page 112).
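
How strong links tie regions into nuclei can be illustrated with a
short sketch. This is not the LISP code of SEE: the function names are
hypothetical, the merging rule (two nuclei are fused when at least two
strong links join them) is a simplification suggested by the scene
comments, and LOCAL, SINGLEBODY and SMB are not modeled. The data are
the nuclei list of scene L3, where each gensym appears in exactly two
regions.

```python
from collections import Counter

# Nuclei list of scene L3: region -> strong links (gensyms) leaving it.
L3_LISTING = {
    ":6": ["G0009", "G0007", "G0005", "G0004"],
    ":5": ["G0010", "G0008", "G0007", "G0004"],
    ":4": ["G0010", "G0009", "G0008", "G0005"],
    ":1": ["G0015", "G0013", "G0012", "G0011"],
    ":2": ["G0016", "G0014", "G0013", "G0011"],
    ":3": ["G0016", "G0015", "G0014", "G0012"],
    ":7": [],
}

def links_from_listing(listing):
    """Pair up the two regions that carry each gensym."""
    ends = {}
    for region, gensyms in listing.items():
        for g in gensyms:
            ends.setdefault(g, []).append(region)
    return [tuple(regions) for regions in ends.values() if len(regions) == 2]

def form_nuclei(regions, links, min_links=2):
    """Fuse nuclei joined by at least `min_links` links, until no change."""
    group = {r: frozenset([r]) for r in regions}
    changed = True
    while changed:
        changed = False
        counts = Counter()
        for a, b in links:
            if group[a] != group[b]:
                counts[frozenset([group[a], group[b]])] += 1
        for pair, n in counts.items():
            if n >= min_links:
                ga, gb = tuple(pair)
                fused = ga | gb
                for r in fused:
                    group[r] = fused
                changed = True
                break  # recount the links after every fusion
    return sorted(set(group.values()), key=sorted)

bodies = form_nuclei(L3_LISTING, links_from_listing(L3_LISTING))
for nucleus in bodies:
    print(sorted(nucleus))
```

The sketch ends with the nuclei :1 :2 :3 and :4 :5 :6, which agree with
the two bodies reported for L3. Region :7 has no links and stays
alone; the sketch does not screen out background regions, which
RESULTS does before printing.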
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SEE 58 ANALYZES L3

EVIDENCE 

LOCALEVIDENCE 

TRIANG 

GLOBAL : 

CONTE (486) GO009 GO007 GO005 Goons) ¢(( 8S) GOO10 GO008 GOG07 SOOO4) (484) GOG10 GOGO Goode GO005) (¢81) GOO15 GOO13 GO 
O12 COOL) (162) GOOL6 COGAd COOLS GOORHD (493) GO016 GOO1S GOO14 GOO1Z) 1487)4) 

CONTLD CNTY (NTL) (086 85 64) GO005 60008 60007 60004 GO010 GO00P GO008 GOOGS) (NIL) (NIL) CCB 82 83) GOOL2 GOO14 GOO1 
3 -GOOIS GO016 GOOIS GOO14 GOOL2) (¢87)0) 

LOCAL 

CONTE) (NIL) €(86 85 84) GOG0S GO008 60007 GO004 60010 60009 GO006 GOOOS) (NIL) ¢(¢(8% &2 83> GOO12 GOO14 60019 GOO1s COOL 
6 GOO§S GO014 GOO12) (8719) 


LOCAL 


C1C81 32 83) COOLZ GQO14 COOLS GOO1L GOOIs6 GOO1S GOOL4 GOO12) (186 8S 84) GOOOS GN00e S0007 GOOG GO0;I0 Goo09 Gonos GO0O 
$)) 

LOCAL 

SMB

RESULTS

(BODY 1. IS :1 :2 :3)

(BODY 2. IS :6 :5 :4)             RESULTS FOR L3

NIL


Scene R3   Two bodies are found in this scene. Vertex F is
classified as of type 'T', hence only one link exists between
:2 and :4.

All scenes have regions, vertices and lines (edges) joining
vertices and separating regions. We generally omit the names of the
vertices from the drawing (figure 'R3'); we are also omitting the
coordinate axes.

Since each region has an inside and an outside, the following
are invalid or illegal configurations in a scene:


A line ending nowhere; illegal.

Our scenes should be such that, to disconnect a separate component
of the graph into two components, we have to remove (delete) at least
two edges. The graph above is "illegal" as input to our program,
since the criterion is not met: removing edge E will disconnect
the graph (cf. page 37). Incidentally, some optical illusions are
"recognized" or rejected because they come from illegal scenes of
the type just described (cf. section 'Optical Illusions').

See 'Illegal scenes', page 217, in section 'On Noisy Input'.
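
The legality criterion above (no single edge whose removal disconnects
a component of the graph) amounts to saying that the graph has no
"bridges". The sketch below checks this with a standard depth-first
search; it is an illustration only, not the test used by SEE, and the
function names and the sample graphs are hypothetical.

```python
def find_bridges(edges):
    """Return the edges whose removal disconnects a component of the graph."""
    adj = {}
    for idx, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, idx))
        adj.setdefault(v, []).append((u, idx))
    disc, low, bridges = {}, {}, []
    counter = [0]

    def dfs(u, parent_edge):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        for v, idx in adj[u]:
            if idx == parent_edge:
                continue  # do not walk back along the edge we arrived by
            if v in disc:
                low[u] = min(low[u], disc[v])
            else:
                dfs(v, idx)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    bridges.append(edges[idx])  # nothing below v reaches u or above

    for node in adj:
        if node not in disc:
            dfs(node, None)
    return bridges

def is_legal_scene(edges):
    """A scene graph is taken as legal when it has no bridge."""
    return not find_bridges(edges)

# Two triangles joined by a single edge (playing the role of edge E).
TWO_TRIANGLES = [("A", "B"), ("B", "C"), ("C", "A"),
                 ("C", "D"), ("D", "E"), ("E", "F"), ("F", "D")]
# A triangle with a line ending nowhere.
DANGLING = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "X")]
TRIANGLE = [("A", "B"), ("B", "C"), ("C", "A")]
```

Both illegal configurations are caught: the joining edge in the first
graph and the dangling line in the second are reported as bridges,
while the plain triangle is accepted.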
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Scene __ SPREAD Body :41-42 was found; also :8-18-19. In the first 


case, there was one strong link between :41 and :42, because of the 
heuristic (g) of table ‘GLOBAL EVIDENCE' (page 87), and SINGLEBODY 
completed the object. In the second case, heuristic (g) could not 
be applied, and SMB had to join :19 with :18. 


Bodies :29-30-31-32 and :25-26-27-28 are adequately found.
Also the badly occluded long body :10-9-11-12-3 is found.

Body :21-6-5-20 is found as one body. An older version of
SEE {Guzman FJCC 68} used to report two: :6-21 and :5-20. The
change is as follows: one link is placed between :6 and :5 because of
the matching T's, the other link is a weak one placed because :5 and
:20 form a LEG; a weak link is also placed between :6 and :5.

:24 gets reported isolated, instead of together with :22-23,
because no Leg is seen; but see comment (page 3°) in section
'Simplified View of Scene Analysis'.


SEE tries to find a "minimal" answer; minimal in the sense
that it will try to explain the scene with the minimum possible
number of bodies (cf. section 'The Concept of a Body'). That is the
reason :41 and :42 were joined in one body, instead of two, which
is another possible correct answer. That is also true of :19-18-8,
interpreted as one parallelepiped with a vertical face (:19) and a
horizontal face (:18-8).

The background of SPREAD is also computed (see page 226 of section
'Background Discrimination by Computer').
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SPREAD 


FIGURE 'S P R E A D'

Bodies :10-9-11-12-3 and :6-21-5-20 are properly found. The body
:19-18-8, which is a parallelepiped with a vertical face (:19) and
a horizontal face (:8-18), is also correctly identified.
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* 
Scenes STACK and STACK*   In both cases all the bodies were accurately
identified by our program, which is written in LISP. In both cases
the body :4-15-16 is found.

These scenes show that in many instances one could drastically
alter the position of a vertex, without modifying the output of SEE
(compare figure 'STACK' with 'STACK*').


Other examples would show that the vertices of type 'L' can be
arbitrarily displaced, so long as their type remains 'L' and other
vertices do not change type, without detrimental effect. This
displacement may possibly affect some heuristics that use concepts of
parallelism or colinearity, but not the rules that use the shape or
type of a vertex (cf. table 'VERTICES', page 63) for placing and
inhibiting links. Read 'Misplaced vertices' in page 2\|, in section
'On Noisy Input'.
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FIGURE ‘STAC K' 


Every body is correctly identified. Compare with scene STACK*.
This pair of drawings illustrates the fact that it is often
possible to disturb the coordinates (the position) of a vertex,
without introducing errors in the recognition.
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FIGURE 'STACK*#! 
Every body is correctly found. Compare with scene STACK. 
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SEE 58 ANALYZES 11 (STACK, STACK*) 

EVIDENCE

LOCALEVIDENCE

TRIANG 

GLOBAL 

C(NIL) €(820) 69067 GO046) ((£19) G0047 GoUude) (185) GOOTG) (1813) GODS) 60040 GYDSO) (189) GOO5S GO054 GO053 Go039) «(8 
7) [0057 GUOS2) ((88) GOOS6 GOD5S GOOS4 GOUTH) (1811) GoObU GUOdY GuLd2 GL0SO) (486) GO059 Goode Goos7) 114810) G00SS GOO 
53 60039 GOOSE) (1812) GOD GO0S8 GOOSD) {184) GODCL GUDG9 GOUAZ GO041) ((314) GOOS1 GO0460) (13219) (081) GOO62 GOODas) 
((318) 6OU65 GO063 GUO4S 60044) (1216) BQUOE GUG4SS GOU4E 60037) (82) 6U066 GO064 GU063 GO062) ((831 Ga066 G00445) (49195) 
60069 GO043 GO042 GUDS7) 41817) GOO6S GO064 L0048 &GO0045)) 

(ONTLE CNILD €¢820 819) GO0046 GOU47 GOO46) (185) GOGSO) (NIL) (NIL) (487) GOOS7 GO0S2) (NIL) (NIL) (NIL) C629 88 310) GO 
053 GO056 60056 GO055 GOOSS GOUIY GLOSS) (686 244 812) GOUS6 BUUS7 GUISY GOUS2 GUD6U GOOSS GOOSO) INIL) 11813 8144) GOO4D 
wO036 FOOS) GOMOD) (i tzhdd (CEL) GO062 GOO44) (NIL) UCN}? (NIL) (NIL) (184 816 835) 60042 GOO] GOD41 GnO49 GoO43 Go0d2 
wO037) (C83 £2 516 817) GO068 GU0K6 LUU62 40063 LOUSd GOU6S 60064 “ODdS G0045)) 

CCNILD (4820 819) GOU46 GU047 GHOSE) (185) GUOSH) (NIL ¢NIL) CNILD CEEY 84 E10) GOUSS GOG56 GUNS4 GoO055 GO053 GOO3y Goo 
3d) 6487 £6 811 212) CO0S2 GO057 GNOSY GOUS2 GOO6U BOGS FOUSG) 11815 314) BUO4U GU036 GOO51 GOO4O) 4¢821)) (NELD tnt) 
CNTED C084 816 825) C0042 COUGH) GUN41 VU04Y LO04S GUO42 GU0S7) (181 FS 82 $18 817) LOUES COOO6 GO0S2 GO063 GO044 GO065 G 

0064 GO0e8 GQ045)) 

LOCAL 

(LOCAL ASSUMES (:5) (:13 :14) SAME BODY)

LOCAL 

CONTLD CCEZO 219) CON46 GO047 GNuUaK) (1839 224 $5) GOUT6 GOUSL GUD4D BOLI) (NIL) (C89 £8 °8410) GOO5S GOOSe G0054 GO05S5 G 
0053 GO039 6003S) (87 86 831 292) BOOGS2 GUOS7 FOUS9 GOUSe GOVOU GOUSS BOOSL? (NIL) (68227) (NIL) (084 816 215) GOOd2 GO 
061 CGOO4! GO049 GO0S3 GOO42 GNOS7) (L481 FS Be TLE 817) COU44 COUHSH GUD62 60063 GO0ed CODES GOO0Ed GO0d8 GNO45)) 

LOCAL 

COOL 83 22 818 117) G0044 GO066 GANZ GOUCS GOU44 6000S GO064 6004S GON4S) ((84 816 845) GO042 GOOOl C0041 60049 Gonads 

GO0S2 GO037) ((87 86 B41 212) GNUSPA GOUS7 &OUSY GOO52 GUND 60056 60050) 6189 88 810) 60053 &6O056 GONS4 GODS5 GON53 GoOoS 
9 BUOIH) (6813 814 85) C0036 GOS] GO040 GUNSE) (1920 $19) &OU46 GOU47 GOUd6)) 

LOCAL 

SMB

RESULTS

(BODY 1. IS :1 :3 :2 :18 :17)

(BODY 2. IS :4 :16 :15)

(BODY 3. IS :7 :6 :11 :12)

(BODY 4. IS :9 :8 :10)            RESULTS FOR STACK AND STACK*

(BODY 5. IS :13 :14 :5)

(BODY 6. IS :20 :19)

NIL


Scene L10   The concave object :11-15-14-7-6 presents no
problem, since there are plenty of visible vertices
(figure 'L10'), and SEE makes good use of them.

SINGLEBODY is necessary to join regions :13 and :2.

The bodies of a scene do not need to be
prismatic in shape, nor convex. Their vertices could
have errors in their two-dimensional position. Table
'ASSUMPTIONS' (page 255) specifies the suppositions that
our program obeys.
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L 10 


FIGURE 'L 1 0'

SINGLEBODY had to join :2 with :13.
All four bodies were happily identified.
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Scene R10   Four bodies are found by our program in R10.

The scene is a good example of a "noisy" scene, in which edges that
should be straight look crooked. This is because the coordinates
of each vertex are "imprecise"; the vertices have some error in
their coordinates. Other scenes also show this tendency; they
accurately represent the data analyzed by SEE (the scenes in their
final form were drawn by program, then inked manually), and should
not be considered as "sloppy drawing jobs".


SEE has several ways to cope with these imperfections:

(1) Tolerant definitions of parallelism and colinearity.

(2) Insensitivity of heuristics to displacements of the vertex.
    For instance, vertex V will inhibit the link that Z proposes,
    either when V is of type 'Arrow' or when it is of type 'T'
    (but not when 'Fork').

(3) Large variations in the coordinates of a vertex are possible
    before that vertex changes type. Vertices of type 'T' are an
    exception, changing into a Fork or an Arrow by a small
    displacement. Nevertheless, it is possible to "straighten" these
    vertices, by following the suggestion in the comments to scene R17.

The section 'On Noisy Input' deals with these matters.
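
The sensitivity of T vertices mentioned in (3) can be seen in a small
sketch. It assumes the usual definitions for vertices where three
lines meet (a 'T' has two of its lines colinear, an 'Arrow' has one
angle larger than 180 degrees, a 'Fork' has all three angles smaller
than 180 degrees); the function name, the tolerance parameter and the
coordinates are hypothetical and are not taken from SEE.

```python
import math

def classify_three_line_vertex(vertex, ends, tol_deg=5.0):
    """Classify a vertex where three lines meet as 'T', 'ARROW' or 'FORK'.

    `ends` holds the far end points of the three lines; `tol_deg` is the
    allowed deviation from 180 degrees for two lines to count as colinear.
    """
    vx, vy = vertex
    angles = sorted(
        math.degrees(math.atan2(y - vy, x - vx)) % 360.0 for x, y in ends
    )
    # Angular gaps between consecutive lines; the three gaps add up to 360.
    gaps = [(angles[(i + 1) % 3] - angles[i]) % 360.0 for i in range(3)]
    if any(abs(g - 180.0) <= tol_deg for g in gaps):
        return "T"
    if any(g > 180.0 for g in gaps):
        return "ARROW"
    return "FORK"

# Far ends of the three lines; the crossbar runs from (-10, 0) to (10, 0).
ENDS = [(-10.0, 0.0), (10.0, 0.0), (0.0, -10.0)]

exact = classify_three_line_vertex((0.0, 0.0), ENDS)       # vertex on the crossbar
moved_up = classify_three_line_vertex((0.0, 2.0), ENDS)    # small displacement
moved_down = classify_three_line_vertex((0.0, -2.0), ENDS)
tolerant = classify_three_line_vertex((0.0, 2.0), ENDS, tol_deg=25.0)
```

With the narrow tolerance the undisplaced vertex is a T, while moving
it two units off the crossbar turns it into an Arrow (upwards) or a
Fork (downwards). With a generous tolerance the displaced vertex is
still read as a T, which is the effect of a tolerant definition of
colinearity.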
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FIGURE 'R1 0° 


The scene contains "noisy" vertices; hence, some
edges look bent. SEE has resources to cope with these
problems.

Figures L10 and R10 form a stereo pair. In Figure
'L10 - R10' in page 247, information from both scenes
is combined to find the position of these objects in
three-dimensional space. See section 'Stereo Perception'.
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Scene TOWER   There is no need to make use of LOCAL or SINGLEBODY
in this scene, since there are plenty of global (strong) links
among the different regions. :18-22 and :17-23 get links thanks
to the heuristic that analyzes vertices of type "X".

There are several "false" vertices, formed by coincidences of
edges and "genuine" vertices: the vertex common to :9, 11, 12 and 13;
the one common to :2, 4, 5, 6. They do not cause problems, because

(1) In the case of the vertex common to :9, 11, 12 and 13, it is of
type 'MULTI', and no link is laid.

(2) In the case of the vertex shared by regions :2, 4, 5, and 6,
it is an "X" that will establish one link between :4 and :5 (which
is correct), and another between :2 and :6 (which will do no
harm, since we need two "wrong" or misplaced links to cause a
recognition mistake).

Compare with scene 'REWOT'.
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FIGURE ‘TOW ER' 


A "wrong" link is placed between :2 and :6, 
without serious consequences. Results for 
this scene are in "RESULTS FOR TOWER’. 
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TET 


SEE 58 ANALYZES TOWER

EVIDENCE

LOCALEVIDENCE

TRIANG 

GLOBAL 

CONTL) ¢€0820) GO019 GOOLE GOO17 GOOL6) (4819) COOZ2 GOO2ZO GO01F GO016) (¢221) SGQ22 GOO20 Goo1s Goo17) 4¢(318) GOO2d Goo2 
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GO043 GOOS3 GOOIO) (137) 6OO36 GOOS4 GOOST) 6129) GOO2Z9 GOO27 GOO2]S) ((88) GOD44 GOOIS GOOS4) (685) C0045 GOOI4 GOO1S) ( 
C8460) C0817) CI02Z6 GOOZS) 1(918) COO48 CGOG45 GOQI4) (484) GOU4S GOO45 COOLS GOOLS) (183) GO049 GO047 GOOS9 GOOSS) (1824 
yd €€82)2 60069 60040 GO039 GO037)) . 

CONIL) (NILD CNEL) (¢926 819 821) GOOL7 GOO2ZD GO089 GOD16 SOO2Z2 GOOZO GOOLS GOOL7) (NEIL) (4338 822) GOO21 GO023 Goodall) ( 
NEL) CNTLD (NTL) CNEL) CNELD CNEL) (NIL) (C883 814 822) GOOSD CGOOd2 GOOSZ GOO43 GOCTL COOSO) (NIL) C4210 B11 39) GOO2S & 
0043 GO028 GOoo2? Go0Z7 GOO25) ¢(¢86 #7 £8) 60046 GO03S GO036 GO0SS GOU44 GOOSS GOOS4) INIL) (08916)) G4323 817) GO024 GOO2 
6 GOO24) (NIL) ((325 85 84) COOLS GOO1S GO04S GO04S GOO1S Gooid)d (NELd (4824)) (482 83 81) COOKE GO037 GOG47 GOOI8 CO049 
G0049 60039 60037)) 

LOCAL 

qeNILy CNILy ¢¢820 £19 €22) GOOL7 Gonzo Go019 Go016 Sooz? Gon20 Goo1s GoO17) (4918 822) GudSL CooZdS GOO21) «NEL? (NIL) 
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7) 60024 60026 GOOZa) (186 47 18) GO046 GO03S 60036 GOOSS SOO44 GONSS GOOS4) (6830 $11 89) GU025 GOO4t GoOzs GOC2Z9 60027 
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19 60016 GOO22 Goo20 GOO148 GO0i7)) 

LOCaL 

SMB

RESULTS

(BODY 1. IS :2 :3 :1)

(BODY 2. IS :15 :5 :4)

(BODY 3. IS :23 :17)

(BODY 4. IS :6 :7 :8)             RESULTS FOR TOWER

(BODY 5. IS :10 :11 :9)

(BODY 6. IS :13 :14 :12)

(BODY 7. IS :18 :22)

(BODY 8. IS :20 :19 :21)

NIL


Scene REWOT   This scene (see figure 'REWOT') is the same as the
scene TOWER (see figure 'TOWER'), but upside down. The program
obtains identical results for both scenes (see 'RESULTS FOR TOWER'
and 'RESULTS FOR REWOT'), because SEE does not use information about
a body supporting or leaning on another body. For instance, it
was not assumed that body :1-2-3 is partially supporting (in figure
'TOWER') body :4-5-15; clearly this assumption fails in the case of
figure 'REWOT'. But since the assumption is not made, the program
succeeds in both cases (gives the same results).

See table 'ASSUMPTIONS' (page 255) for suppositions that the
program makes or presumptions that it does not need.

The regions :16 and :24 had to be marked as part of the
background, following standard practice (cf. 'Input Format').
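
Figure 'REWOT' is obtained from 'TOWER' by replacing X by 100 - X and
Y by 100 - Y, that is, by a half turn of the drawing. The sketch below
(hypothetical names and coordinates, not part of SEE) illustrates why
purely geometric evidence is unaffected: the cross product of two
edges leaving a vertex, which fixes the sense and size of the angle
between them, is the same before and after the transformation.

```python
def upside_down(points, size=100):
    """Replace X by size - X and Y by size - Y for every named point."""
    return {name: (size - x, size - y) for name, (x, y) in points.items()}

def cross(o, a, b):
    """Cross product of the edges o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

# A hypothetical vertex O with two edges, to A and to B.
tower = {"O": (10, 20), "A": (40, 25), "B": (15, 60)}
rewot = upside_down(tower)

before = cross(tower["O"], tower["A"], tower["B"])
after = cross(rewot["O"], rewot["A"], rewot["B"])
```

Both edge vectors change sign, so their cross product is unchanged;
angles at vertices, parallelism and colinearity are therefore
preserved. A rule that relied on "up" and "down" would not share this
invariance.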
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FIGURE 'REWOT' 


This scene is the same as the scene TOWER, 
but with Y replaced by 100. - Y, and 

X replaced by 100. - X : it is upside 
down. SEE still finds eight bodies. 
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SEE 58 ANALYZES REWOT 

EVIDENCE 

LOCALEVIDENCE 

TRIANG 

GLOBAL 

CONTR) 642920) GOL1S4 GO1SS GOLS2 GOL1SL) (1819) GO1S7 GOISS GO1S4 G 
O83) ¢€6823) GOLS7 GOLSS GOLS3 GOLS2) ¢¢816) C0138 CO1I36) (1822) 

GOIS8 GO1S6). ((823) 60141 GOLS9) (6820) GOLS6 GO144 GoO143 GO140) 
((811) CGOLS6 GOL4S GOL42) ¢¢813). GO1S7 60147. GO145) (¢814) 60158 
GO157 GO147 GO146) (¢26) GOL16L GO159 GO1S1 GOLSO GO148) ((82) GOI 
62 60161 60155 G0153 GO1S2) (¢812) GO158 GO146 60145) ((37) GO1S) 
60149 GO148) ¢¢89) GO1L44 GO142 GO140) ((29) GO15S9 6OL5O GO149) { 
(35) GOI60 SOL29 GO128) (¢226)) 66847) C0142 GO1S9) ((815) 6016S 

GOI30 GO129) ((84) GO163 GOL60 GO130 GO128) ((83) GO1L64 GO162 Gol 
$4 GOISS) ((324)) (¢81) 660164 GOSS 60154 60152)) 

CONTLY CNEL) CNEL) (0820 8139 824) GO132 GO‘SS CO1S4 GO131 60137 & 
0135 GOLSS GOLTA) (NIL? (4316 $22) GOLS6 GU158 GO1S6) (NILY (NIL) 
(NIL) (NIL) (NIL) (NIL) (NIL) (¢815 814 5132) COI148 6O157 GO147 G 
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LOCAL 
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148 GO1IS9 GOL5O GO149) (62160) (4823. 847) COLTD COLE GO1SH) (C88 
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LOCAL 

(082 &3 33) 60161 GOIS2 60162 GO153 GE164 GO155 GOI5S4 GOIS2) (4(8 
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> (0320 8112 89) 60140 GO156 GO1L43S BO144 GOL42 GO140) (C813 814 83 

2) 60145 GO15S7 GO147 GO158 60146 GO145) ((338 222) 60136 60158 6O 
$36) ((220 819 821) GO1LS2 GO13S VO1S4 GOLSL GOL3S7 GO135 GOLSS Gol 

32)) 

LOCAL 

SMB

RESULTS

(BODY 1. IS :2 :3 :1)

(BODY 2. IS :15 :5 :4)

(BODY 3. IS :23 :17)

(BODY 4. IS :6 :7 :8)             RESULTS FOR REWOT

(BODY 5. IS :10 :11 :9)

(BODY 6. IS :13 :14 :12)

(BODY 7. IS :18 :22)

(BODY 8. IS :20 :19 :21)

NIL
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Scene WRIST*   The concave objects are properly identified. W places
a link between :23 and :4, and another between :30 and :4. CC does
not inhibit the link between :17 and :19 ordered by the Arrow NA,
because NOSABO was never called, since the first rule of 'ARROW'
(page #4 ) was applied.

The only mistake was that objects :9-7-6 and :10-5 should be
fused and reported as only one. There is a link between :9 and :10
put by heuristic (g) of table 'GLOBAL EVIDENCE'. It is not enough.
There is also a weak link between 'Triangles' :5 and :6. OB is not
a 'Leg', so there is no weak link between :10 and :5. The situation
is as follows (see chains of links in 'RESULTS FOR WRIST*'; how to
read these chains is explained in page 110, 'Explanation of the
printout produced by the program'):

:10 and :5 will get joined later by SINGLEBODY.

Almost the same thing occurs with :1-2-22-21, but in this case
vertex A produces one strong link between :22 and :21, and vertex R, by
heuristic (g) of table 'GLOBAL EVIDENCE', also links :22 with :21. This
is enough.
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FIGURE 'WRIST*' 


Instead of one, two bodies were found in :9-7-6 and :10-5.
Insufficiency of links was the offending reason. All other
objects were correctly found.
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Scenes L2 and R2 


Two objects are found, as expected. 

These scenes form a stereographic pair: two pictures of the same
scene taken from slightly different locations, keeping the optical
axes of the cameras parallel and the magnification the same. A
program, not yet completed, is designed with the following ideas:

Left and right pictures are independently processed by SEE; L2 and
R2 in this example. The answers are

     ANALYSIS OF L2                ANALYSIS OF R2
     (BODY 1. IS :2 :4)            (BODY 1. IS %:1 %:2 %:4)
     (BODY 2. IS :1 :5 :3)         (BODY 2. IS %:3 %:6 %:5)


The question is now: Is body :2-:4 the same body as %:1-%:2-%:4,
or is it %:3-%:6-%:5? It is required, after decomposition of the
scene into bodies, to match the left bodies with the right bodies.

If this is accomplished, one could then locate the figure in
three-dimensional space, from the two-dimensional coordinates of the
figure in the left and right scenes.

In this way it will be known where these objects are located in
the "real world".


This "matching" mentioned above is complicated as follows:

-- It is possible that the number of objects observed in one view
   is different from the number in the other.

-- On a given object, it is possible that SEE will make a mistake
   in the left view, but not in the right view; as a consequence,
   two bodies on the left have to be matched with one on the right.


If the two axes of the camera are on a horizontal plane, a vertex
in the left scene and its corresponding vertex in the right scene
(if visible) will have the same y-coordinate, such as H in L2 and
%I in R2. Other known relations exist, derived from the relative
position of the axes of the camera, magnification, etc. See section
'Stereo Perception'.
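
The equal y-coordinate relation gives a simple first filter when
looking for corresponding vertices. The sketch below only illustrates
that filter; it is not the matching program mentioned above, and the
function name, the tolerance and all coordinates are hypothetical.

```python
def candidate_matches(left_vertices, right_vertices, y_tolerance=1.0):
    """For each left vertex, list the right vertices at (nearly) the same height."""
    matches = {}
    for left_name, (_, left_y) in left_vertices.items():
        matches[left_name] = sorted(
            right_name
            for right_name, (_, right_y) in right_vertices.items()
            if abs(left_y - right_y) <= y_tolerance
        )
    return matches

# Hypothetical vertex coordinates in the left and right scenes.
LEFT = {"P": (30.0, 42.0), "Q": (10.0, 15.0)}
RIGHT = {"%P": (24.0, 42.3), "%Q": (6.0, 14.8), "%R": (50.0, 70.0)}

candidates = candidate_matches(LEFT, RIGHT)
```

Here each left vertex keeps exactly one candidate. In a real scene
several vertices may share a height, so the filter only narrows the
search; the remaining ambiguity has to be resolved with other
relations, such as the ones mentioned above.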
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R2 


FIGURE 'R 2'

Two bricks are found.
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L 2 


FIGURE 'L 2' 


Even if (possibly) a face of object :4-2 is missing,
in this case SEE makes the correct identification.
Section 'On Noisy Input' deals with imperfect
information.
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Scene L19   The small triangle :15 just could not get joined with
the remainder of the body :16-20-19, and two objects were found.
There is a weak link between :15 and :19, but it did not help since
there is no link between :15 and :16. What happens is that regions
:1, :15, :13 and :22 all meet forming a vertex of type MULTI; this
vertex should (in some future version of SEE) be split into two,
since both :1 and :37 are the background. The rule for this splitting
seems to be

[Diagram: rule for splitting a vertex of type MULTI.]

:11 was joined with :4, but isolated from :12-27-5. There are
no T-joints between these two nuclei that could give 'hints' (i.e.,
links) for their unification.

The two large concave objects were properly isolated.
Compare with R19 and WRIST*.
See 'Merged vertices', page 221, in section 'On Noisy Input'.
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FIGURE '‘'L 1 9! 


It was easy to find :6-7-8-9, the hexagonal prism.
:15 was reported as a single object: a mistake. The two big
concave objects were appropriately identified.
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Scene R19 
As in L19, here the triangle :27 is detached from
:5-32-33, two bodies being reported. There is no strong link between
:27 and :33. There is a weak link between :27 and :5, because both
are 'triangles' facing each other, but that is not enough. A weak
link is never enough.

All other bodies are properly found, including :10-16-2-3.

Vertex RA, of course, contributes no links. The situation
could change if we discover that RA is a false vertex,
that is, one composed by the merge of two genuine ones.
There is enough information, I think, since :34 and :37 are
background, and this will suggest a way to "divide" vertex RA into
two simpler ones. This idea of dividing vertices of type MULTI into
simpler ones should be applied with caution, since there will be
genuine vertices of type MULTI (which should not be split). The main
use of this technique will be for helping single regions to join some
other body, a task performed now, not too satisfactorily, by SMB.

Compare with L19 and WRIST*.
See 'Merged vertices', page 221.
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Scene CORN   The pyramid :8-9-10 was easily identified because a vertex
of type PEAK produces many links. At the bottom, bodies :1-2-3-4 and
:12-13-11 were separated, because the fork between :4 and :12 has the
background as a region, and did not contribute any links.
Certainly, this is a possible interpretation. Another interpretation is
to regard the object :1-2-3-4-11-12-13 as a prism with the shape
of a "C".

SINGLEBODY was needed to join :4 with :2-3-1, the only link
being placed by heuristic (g) of table 'GLOBAL EVIDENCE'.

The program knows that :22 is the background.

If we could see the hidden vertex KK (if it indeed exists),
two links would be put and we would have one body:
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FIGURE ‘CORN’ 


The pyramid at the top was identified
properly. Two bodies were found at
the bottom, which is a plausible
interpretation: :1-2-3-4 and :11-12-13.
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Scene L9   Here the tolerances SINTO and COLTO that allow for
"sloppy parallelism" have made T's out of NA and PA. Therefore,
these vertices do not contribute any links for :1. Moreover, the
'T' PA inhibits the link suggested by QA between :1 and :8.
That being all, :1 gets reported as a single body (see next page).
By decreasing the tolerances, correct identification is possible
(see the correct identification in page 155).
See 'Tolerances in collinearity and parallelism', page 215.
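
The effect of such a tolerance can be sketched as follows. How SINTO
and COLTO are actually defined is not stated here; the sketch assumes,
for illustration only, a tolerance expressed as an upper bound on the
sine of the angle between two directions. The function names and the
numbers are hypothetical.

```python
import math

def sine_of_angle(d1, d2):
    """Absolute value of the sine of the angle between directions d1 and d2."""
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    return abs(cross) / (math.hypot(*d1) * math.hypot(*d2))

def roughly_parallel(d1, d2, sine_tolerance):
    """'Sloppy parallelism': accept directions whose angle has a small sine."""
    return sine_of_angle(d1, d2) <= sine_tolerance

# Two directions about 5.7 degrees apart (sine close to 0.0995).
first, second = (10.0, 0.0), (10.0, 1.0)

loose = roughly_parallel(first, second, sine_tolerance=0.15)
strict = roughly_parallel(first, second, sine_tolerance=0.05)
```

With the loose tolerance the two directions count as parallel, which
is the kind of decision that makes a T out of a vertex that is not
one; with the strict tolerance they do not, in line with the remark
that decreasing the tolerances allows the correct identification.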
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‘SST 


FIGURE 'L 9'

Four bodies are identified.
Body :1-8-9-7-5-6 gives some problems.
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SEE 58 ANALYZES L9 

EVIDENCE See also next page. 
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Scenes R9 and R9T   Four bodies are found in R9, five in R9T. The
difference is that Y and JA (see figure at bottom of this page) are not
"matching T's" in R9T. The strong links among :12, :3, :10, and :16 are:

[Diagrams: LINKS FOR R9; LINKS FOR R9T.]

In R9, the two strong links (G0030 and G0021) between :12 and :10
were put by the matching T's Z-EA and Y-JA; of the two strong links
between :10 and :16, one was because DA is an arrow; the other,
because EA is a "T" for which heuristic (g) of table 'GLOBAL EVIDENCE'
applies.

But in scene R9T, not having Y and JA as matching T's, a link
between :10 and :12 disappears; and also nuclei :16 and :10 can
not be linked by heuristic (g) of table 'GLOBAL EVIDENCE'. SEE
decides to report two bodies there: :3-12 and :16-10, instead of one
as in scene R9.
Are Y and JA matching
T's or not? Different
answers produce different
analyses of the scene.

These scenes show that the analyses can be quite sensitive to
the "right" definition of parallelism and colinearity.
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SEE 58 ANALYZES RO 

EVIDENCE

LOCALEVIDENCE

TRIANG

GLOBAL

CONTRD €6%820) 60018 GOOL7 60016 GOO14 COOLS GOO1Z) ((x819) GOOZO GO0I9 GOO1S GOO16 GO015) ((x213) GOO23 GOO22 Goo2D1 «1 
4215) GO026 GOO25 GOG23 GoO2s?) (14910) 60051 GOOSD GOO26 GO021) (14316) GO0S1 GOD29 GOO26)? <(xt8) GOO27 GOO19 GoO1S Gods 
4 GOO1Z) (14817) GOOT2 GOOZ7 GONZ6 GOUZS GOOZE) ({%87)} GOOTZ GOC24 GOO17 VOOLS) (4429) GOOSE GOOTS COOSS) (14314) GO036 
GOOSS GOO3S?! ((486) GOO36 GOO3S7) ¢¢%818) GOOS7) ((X85) COOIS COOSA) 114811) COOKS GOD42 GOOSI GOOSG) (6X81) GOD4I GODA2 
GUO38) ((%812) GOO4S GOOIU GODZL) ((%433) CONS COUZG) (1%82) GOU46 60044 6O04S [UDG GODT) (1X34) 50046 GOO4t GOO4O) (1 
BB21000 

CONTE) ONTL) (NTL) GNTLD QNTLD CNEL) CNT) ONTO) ONELD (08820 2ELG X38 ZE13 4915 4847 297) GOO13 GOOLe GOO016 GOO19 GoO1S 
6U014 bOOL2 GO020 GO023 GOOZ2 GOUZ7 60026 GO025 GO032 GOU24 GOOL7Z GOOLS) (NEL) (NEIL? (6%26) 60038 GOG37) (1%818) 60037) 
(299 2824 285) GOGIS GOOSS GOGSS GOOS4) (NIL) CNIL) (NIL) C4840 KaLO 4312 %33) GOO3] GOO2Z8 GOON 60021. GO045 GOO29) 1 

NIL) C0482 KELL 282 K84) COOSS COOS42 6[O046 GO04d GO04S SOLID GOU46 F004 GO040) (1x821))) 

LOCAL 

(LOCAL ASSUMES (x86) (224 %812 282 234) SAME BODY) 

CONTO) CNILD (NSO) (NIL) CONTE) 005920 2EL9 258 R813 KE15 8817 427) COOLS GouULs Go01S Good GOO1S Gool4 Goo12 Goa20 GoD023s 
GOOZ2 GODZ7 GOO2Z6 GOO2ZS GOO32 GOO24 GOUL7Z GUOLT) (NIL) (OKEL R811 282 KE4 436) FO042 CO044 CO043 GOOID GOO46 Good GODS 
O |OOSS GOOS7) (14216) GCOO37) CIdtg KEA4 KE5) COGISG COOIT GOOSS CONS) CNIL 66%810 4816 X812 %83) GOO31 SOO2Z—a GO030 COO 
21 G0045 GOO2Z9) (NIL) FOX821092 

LOCAL ; 

(SINGLEGODY ASSUMES (X83 2844 X82 224 X86) (X46) SAME BOLY? 

(C(B310 X316 2842 X83) BOOS) Gon26 GoosO Gunz! GoO4S Goo2y) CCSD X814 ZED) COOIE GOOSS GooSS &OOS4) INTL) (CANS ZEGE 23 

2 X84 486 2818) GO042 GO044 GO043 GO03D GOU46 GOD41 GO040 GOOSS GOCS7} (12820 4819-486 X3L9 X815 NIL7 X87) GOO1S GOOLE & 

0016 GOO19 SO015 GO014 C0012 GOO20 GO023 GOOZeZ 6O027 GO026 GOO25 GOOT2 GO024 60017 GO013)) 

LOCAL 

SM& 

RESULTS 

(BODY 1. 5S 2810 28160 K812 %83) 

(BOLY 2. 1S %89 K814 K55) 

(BODY 3, 1S BEL RESA KEQ X84 X86 XBL8) RESULTS FOR R9 

(BODY 4. IS R820 KE19 48H KIA3 4835 2317 487) 

Niu 


FIGURE 'R 9'


The four bodies were found.
SINGLEBODIES was needed to join :18
with :6-11-1-4-2.
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R9T


FIGURE 'R 9 T'
SINGLEBODIES joins :18 with the
other portion of that body; LOCAL
is needed to join :6 to that
portion, and :16 with :10.

Nevertheless, since :12 and :10 were
not found to be the same face, body
:16-10 is found, and body :12-3.
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Scene TRIAL   This scene has been analyzed in great detail in the
section that describes the program SEE. Its links are found in
graphic form in figure 'TRIAL - LINKS', or in written form (lists)
in 'RESULTS FOR TRIAL'.


LOCAL had to join :13 with the remainder of that body.
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SEE 58 ANALYZES TRIAL

EVIDENCE

LOCALEVIDENCE

TRIANG

GLOBAL

[Lists of nuclei and links (G00nn), printed after each step; illegible in this copy]

LOCAL

(LOCAL ASSUMES (:13) (:4 :9 :5 :7 :3 :8) SAME BODY)

LOCAL

LOCAL

LOCAL

SMB

RESULTS

(BODY 1. IS :6 :2 :1)

(BODY 2. IS :11 :12 :10)

(BODY 3. IS :4 :9 :5 :7 :3 :8 :13)              RESULTS FOR TRIAL


NIL


Scene ARCH   SEE analyzes scene ARCH (see figure 'ARCH') with results
displayed in 'RESULTS FOR ARCH'. This is a scene composed of many
degenerate views of objects. It is an ambiguous scene (see section
on Optical Illusions), in that several good interpretations are
possible.

The program reports :7 and :17 as one body, which is plausible.
:16, :9 and :10 get reported as independent objects. In
the scene from which this line drawing was taken, :7, :17
and :16 were the vertical face of an object. :10 was the vertical
face of another, :9 being its horizontal (top) face. In cases like
this, in order to choose the "right" one of several possible
interpretations, more information has to be supplied to the program, such
as lighting, textures, color, etc.

No link was put by A between :3 and :29, or by UB between :5 and
:19, because D and W are GOODTs. In one case, G provides more
links and causes :3-8-29-31 to be reported as one body, which is
correct; in the other case, Q cannot supply any links, and that
body is split in two: :5-4 and :19-18. This is a mistake of GOODT,
which accepts W as a genuine T. If this were not the case, the arrow UB
would establish a link between :5 and :19, avoiding the mistake. GOODT
could stand some improvement.

The body :22-23 was identified correctly.
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ARCH 


FIGURE 'A R C H'


Ambiguous scene that could be correctly interpreted in
several different manners. :7-17 was reported as a single
body (see table 'RESULTS FOR ARCH'), and also :9.

The body :5-4-19-18 was split in two: :5-4 and :19-18, 
but not :3-8-29-31, which was counted as one body. 
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SEE 58 ANALYZES ARCH

EVIDENCE

LOCALEVIDENCE

TRIANG

GLOBAL

[Lists of nuclei and links (G00nn), printed after each step; illegible in this copy]

LOCAL

(LOCAL ASSUMES [illegible] SAME BODY)

(LOCAL ASSUMES (:3) (:8 :31 :29) SAME BODY)

LOCAL

LOCAL

(SINGLEBODY ASSUMES (:23) (:22) SAME BODY)

(SINGLEBODY ASSUMES (:17) (:7) SAME BODY)

(SINGLEBODY ASSUMES (:4) (:5) SAME BODY)

LOCAL

SMB

RESULTS

(THE FIRST 9. BODIES ARE (:28) (:27) [illegible] (:20) [illegible] (:15) (:10) (:9) [illegible])

(BODY 10. IS :2 :1 :30)

(BODY 11. IS :18 :19)

(BODY 12. IS :33 :34 :32)

(BODY 13. IS :13 :12 :14)

(BODY 14. IS :17 :7)

(BODY 15. IS :23 :22)              RESULTS FOR ARCH

(BODY 16. IS :26 :24 :25)

(BODY 17. IS :4 :5)

(BODY 18. IS :8 :31 :29 :3)

NIL


Scene HARD   This scene consists of objects of the same shape, namely
triangular prisms. All are correctly identified, including the long
and twice occluded :3-21-22-23-24-28-29. :1-2-33 was also found.
LOCAL had to be used to join :15 with :16, and also :11 with :12.

In an older version of the program, :7 was identified as a single
body, and :6 as another, because they have no visible "useful"
vertices to place links {Guzman PISA 68}. Now SEE joins :6 and :7,
because both are "GOODPALs". See "Operation of the Program; SMB" (page
99).

These scenes are sometimes obtained from a picture, so that
they are the result of a perspective transformation. Some other
scenes are drawn more or less in an orthogonal or isometric projection.
SEE does not depend heavily on the type of projection; there are only
a few heuristics that use notions of parallelism.
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HARD 


FIGURE 'HARD'


All the bodies were correctly found. 
The most difficult was :6-7, since SMB 
had to join both regions, which do 
not have "useful" visible vertices. 
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SEE 58 ANALYZES HARD

EVIDENCE

LOCALEVIDENCE

TRIANG

GLOBAL

[Lists of nuclei and links (G00nn), printed after each step; illegible in this copy]

LOCAL

(LOCAL ASSUMES (:11) (:12) SAME BODY)

(LOCAL ASSUMES (:15) (:16) SAME BODY)

LOCAL

[Nuclei after the last LOCAL step, links omitted:
 (:11 :12) (:16 :15) (:32 :31 :30) (:9 :10 :8) (:18 :19 :20)
 (:13 :17 :14) (:5 :4) (:1 :2 :33) (:24 :22 :3 :23 :21 :28 :29)
 (:25 :26 :27) (:7) (:6)]

LOCAL

SMB
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(BODY 2. IS :16 :15)
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Scene L4   The body :10-9 was reported isolated from :13-2-3,
due to insufficiency of links. See comments to figure R17, also.
The algorithm that localizes matching T's could stand improvement.
It sometimes produces "bad links" such as between :4 and :13, and
between :6 and :3, because it finds two T's (EA and R in this case)
that look like they are matching. (This mistake did not actually
happen here, because vertex R is not a T, but a fork!) The suggestion
on page 173 will lessen, but not suppress, these "mistakes".
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L4


Body :2-3-13 was reported.
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FIGURE 'L 4'


SEE 58 ANALYZES L4
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LOCALEVIDENCE 

TRIANG 
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LOCAL 

SMB

RESULTS

(BODY 1. IS :4 :5 :6)

(BODY 2. IS :11 :12 :7 :8 :1)              RESULTS FOR L4

(BODY 3. IS :10 :9)

(BODY 4. IS :13 :3 :2)

NIL


Scene R4   The table 'RESULTS FOR R4' shows what happens when the
tolerances are too large. Five bodies are found. Vertex B is
considered to be a "T", and inhibits the links suggested by the arrows
R and A. As a result, :1 gets cut off from :7-9-5-10.

The way :2 gets isolated is as follows: T and AA claim to be
matching T's, the link suggested by U is inhibited by Z (a Corner),
and :2 gets disconnected from :3-4.


The correct solution is obtained after reducing the values of
COLTO and SINTO to 0.05 and 0.005, respectively (see listings; COLTO
decides if two lines are colinear, SINTO if they are parallel). The
results also appear in 'RESULTS FOR R4', and we can see now that only
three bodies (the correct ones) are identified.
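
The role of the two tolerances can be illustrated with a minimal sketch. It is not the code of SEE (the functions PARALLEL and COLINEAR are in the listings); it assumes, for illustration only, that SINTO bounds the sine of the angle between two segments, and that COLTO bounds both that sine and the sine of the angle between the first segment and the line joining the two. The function names and the exact form of the tests are hypothetical; only the two numerical values come from the text.

```python
import math

COLTO = 0.05   # colinearity tolerance (value quoted in the text)
SINTO = 0.005  # parallelism tolerance (value quoted in the text)

def sine_between(u, v):
    """Absolute sine of the angle between two non-zero 2-D vectors."""
    cross = u[0] * v[1] - u[1] * v[0]
    return abs(cross) / (math.hypot(u[0], u[1]) * math.hypot(v[0], v[1]))

def direction(seg):
    """Direction vector of a segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return (x2 - x1, y2 - y1)

def parallel(seg1, seg2, sinto=SINTO):
    """Assumed test: parallel if the sine of the angle is within sinto."""
    return sine_between(direction(seg1), direction(seg2)) <= sinto

def colinear(seg1, seg2, colto=COLTO):
    """Assumed test: nearly the same direction, and seg2 starts nearly
    on the prolongation of seg1."""
    u, v = direction(seg1), direction(seg2)
    # Vector from the end of seg1 to the start of seg2.
    w = (seg2[0][0] - seg1[1][0], seg2[0][1] - seg1[1][1])
    if sine_between(u, v) > colto:
        return False
    return w == (0, 0) or sine_between(u, w) <= colto

# The same displacement of one unit at an end point changes the
# direction of a short segment much more than that of a long one.
short_sine = sine_between((10, 0), (10, 1))
long_sine = sine_between((100, 0), (100, 1))
print(short_sine > long_sine)
```

With loose tolerances both tests accept more pairs of lines, which is how a vertex can come to be treated as a "T" when it should not be.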


Suggestion   Lines like the one below should be
"straightened" either by SEE or (better) by the preprocessor; for
example, BK LN and DGHO in figure R17. See section 'On Noisy
Input'.
Conservatism and Tolerance   Stricter tolerances do not make the
program more conservative in all cases: the link in (a) fails to be
placed if the program has too loose (large) tolerances, because A
will be considered to be a "T", losing the link; the link in (b)
fails to be placed if the tolerances are too strict, because the
T-joints will not be found colinear.


      a                    b


In (a), links disappear if tolerances are
too big; in (b), if they are too small.
In both cases, conservative behavior (cf.
page 212) appears.
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FIGURE 'R 4'
Either three or five bodies are found, depending on the values of
certain parameters. These scenes are "noisy" in the sense that
the coordinates of the vertices depart from their "real" position
by as much as one millimeter, or about 1 % of the total size of
the image, which is about one decimeter. This is not large
enough to affect long lines, but it may substantially alter the
direction of short segments.
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Scene MOMO   The long body :29-30-34-20-19 gets identified as follows:
:29 and :30 get two links, and :30 with :19 also, so we have the
nucleus :29-30-19. Two links (because of matching T's) join :34 with
:20, to form nucleus :34-20. Regions :30 and :34 receive a strong
link, by heuristic (g) of table 'GLOBAL EVIDENCE', and :19 with :20
for the same reason. That completes the body.
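
The way this body is assembled can be replayed with a small sketch. It is a simplification and not the code of SEE: it assumes only the rule that two nuclei are merged when at least two strong links run between them, and it ignores weak links, LOCAL, SINGLEBODY and SMB. The function name merge_nuclei is hypothetical; the links listed are the ones mentioned in the text for MOMO, plus the single link between :12 and :13 discussed next.

```python
from itertools import combinations

def merge_nuclei(regions, strong_links):
    """Repeatedly merge any two nuclei joined by two or more strong links."""
    nuclei = [frozenset([r]) for r in regions]
    merged = True
    while merged:
        merged = False
        for n1, n2 in combinations(nuclei, 2):
            # Count the strong links running between the two nuclei.
            count = sum(1 for a, b in strong_links
                        if (a in n1 and b in n2) or (a in n2 and b in n1))
            if count >= 2:
                nuclei.remove(n1)
                nuclei.remove(n2)
                nuclei.append(n1 | n2)
                merged = True
                break  # the list changed; restart the scan
    return nuclei

momo_links = [
    (29, 30), (29, 30),   # two links between :29 and :30
    (30, 19), (30, 19),   # two links between :30 and :19
    (34, 20), (34, 20),   # matching T's
    (30, 34), (19, 20),   # heuristic (g), one strong link each
    (12, 13),             # a single link: not enough to join
]
bodies = merge_nuclei([29, 30, 19, 34, 20, 12, 13], momo_links)
print(sorted(sorted(b) for b in bodies))
```

The nuclei :29-30-19 and :34-20 are joined only at the end, when the two links of heuristic (g) are counted together; :12 and :13 stay apart because a single link separates them.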


The fork that is common to :12, :13 and :14 puts a link between
:12 and :13, but it is not enough to cause mis-recognition. A link
is put by that same fork between :13 and :14, as it should be, but
the link between :12 and :14 is inhibited by NOSABO.


There is a program that finds regions of a scene belonging to
the background, when not indicated as such in the input. For MOMO,
the results of this program appear on page 2%! .
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Scene BRIDGE   Region :10 gets a strong and a weak link with :4, and that
is enough to join them. The same is true for :7.

The links of scene BRIDGE (see 'RESULTS FOR BRIDGE') are discussed
and displayed on pages 95-98, in figures 'LINKS-BRIDGE' (page 95),
'NUCLEI-BRIDGE' (page 96), 'NEW-NUCLEI-BRIDGE' (page 97), and
'FINAL-BRIDGE' (page 98).

Because RA and SA are matching T's, two wrong links are placed:
one between :22 and :28, and the other between :21 and :29. This is
not enough to cause an error, because two mistakes reinforcing each
other (two wrong strong links) are needed to fool the program. But
that could happen.

It is interesting to note the way in which the long "horizontal
table" :25-24-21-27-9-12 was put together. To this effect, see figures
'LINKS-BRIDGE' and 'NUCLEI-BRIDGE'.

Vertex JB produces only one link between :5 and :8. Vertex KB
inhibits the link (through NOSABO) between :8 and :9, and the link between
:5 and :9 gets inhibited by S, because it is a T (cf. NOSABO, page 82).

The concave object :7-6-5-4-8-10-11 gets properly identified.

We may say that, in general, the more "crooked" or complicated an object
is, the easier it will be for SEE to isolate it, because there will be
many vertices contributing valuable links.

No mistake was made by SEE on BRIDGE; its eight bodies were
correctly identified (see 'RESULTS FOR BRIDGE', page (8! ).

The background of 'BRIDGE' was also correctly isolated; see
page 230, section 'On background discrimination by computer'.
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DISCUSSION 


We have described a program that analyzes a three-dimensional
scene (presented in the form of a line drawing)
and splits it into "objects" on the basis of pure
form. If we consider a scene as a set of regions (surfaces),
then SEE partitions the set into appropriate subsets,
each subset forming a three-dimensional body or
object.

The performance of SEE shows us that it is possible
to separate a scene into the objects forming it, without
needing to know these objects in detail; SEE does not need
to know the 'definitions' or descriptions of a pyramid, or
a pentagonal prism, in order to isolate these objects in a
scene containing them, even in the case where they are
partially occluded.

The basic idea behind SEE is to make global use of
information collected locally at each vertex: this information
is noisy, and SEE has ways to combine many different
kinds of unreliable evidence to make fairly reliable
global judgments.

The essentials are:

(1) Representation as vertices (with coordinates),
lines and regions.
(2) Types of vertices.
(3) Concepts of links (strong and weak), nuclei and
rules for forming them.


iid vrjonae harmo alleen ech 
sented in symbolic form. 


Since SEE requires two strong evidences to join two
nuclei, it appears that its judgments will lie on the
'safe' side, that is, SEE will almost never join two regions
that belong to different bodies. From the analysis
of scenes shown above, its errors are almost always of
the same type: regions that should be joined are left
separated. We could say that SEE behaves "conservatively,"
especially in the presence of ambiguities.

Division of the evidence into two types, strong and
weak, results in a good compromise. The weak evidence
is considered to favor linking the regions, but this evidence
is used only to reinforce evidence from more reliable
clues. Indeed, the weak links that give extra
weight to nearly parallel lines are a concession to
object-recognition, in the sense of letting the analysis system
exploit the fact that rectangular objects are common
enough in the real world to warrant special attention.

Most of the ideas in SEE will work on curves too.
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


CURVED OBJECTS 


How to extend SEE to work with objects possessing curved surfaces. 


Introduction and Summary   Most of the heuristics that establish links
at each vertex are unconcerned if the edges are curved or straight; a
few heuristics get affected: those that use the concepts of colinearity
and parallelism.


Thus, it is necessary to redefine and broaden these concepts.


1. A slight generalization is obtained if each segment is represented
as having two slopes (initial and final). The functions PARALLEL and
COLINEAR of SEE are already modified for this (cf. listings).


SEE does not care if the line joining two vertices
is a straight or curved line. The information
about the segment A-B that is relevant to SEE is:
(a) There is a line between vertex A and vertex B.
(b) The coordinates of A and B.
(c) The segment A-B separates region :1 from :2.


2. Attempts to take limited account of the shape of the segment carry
us to
(a) gently bent segments (definition) are those with bounded slope
[bounded curvature will lead to another definition].
A quasi-rectilinear object has faces, vertices and gently
bent edges or segments; it is expected [conjecture] that SEE
will work well for them. We should try some scenes [suggestion].


a, b: gently bent segments. c: non-gently bent
segment. A gently bent segment has a slope that
at no point of the segment differs by more
than epsilon from the mean slope of the segment.
All slopes fall in an interval around the mean
slope. Gently bent segments form quasi-rectilinear
objects.
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Quasi-rectilinear objects. It is expected
that SEE will work well for them.


(b) partition of a non-gently bent segment into several gently
bent ones. Many bodies have vertices and curved edges,
but are not quasi-rectilinear (a piece of chewed
gum, leaves of a tree). By breaking the edges into gently
bent sub-segments, they become quasi-rectilinear bodies.
The breaks will occur at points where the curvature is large.
A way has to be devised to break a segment in a unique
manner. To avoid breaking a body into two by the introduction
of these artificial vertices, we propose to introduce
also artificial links between regions, to account for the
artificial vertex.

The non-gently bent segment ab
gets broken into gently bent segments
ak, kl, lm, mb, by the
artificial introduction of "new"
vertices k, l, m.


Here, the introduction of
additional vertices has to
be accompanied by 'artificial'
or reinforcing links,
to preserve the individuality
of the body (of the
owner of such vertices).
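
The definition of a gently bent segment in 2(a) and the partition of 2(b) can be sketched as follows. This is only one possible reading, under stated assumptions: a segment is given as a list of points, "slope" is measured as the direction angle (in radians) of each step between consecutive points, the mean is the plain arithmetic mean of these angles (so the segment is assumed not to cross the direction where the angle jumps between +pi and -pi), and the breaks are chosen by a greedy scan from one end. The greedy rule is just one deterministic choice, not the unique manner asked for in the text, and the artificial links are not modelled. All function names are hypothetical.

```python
import math

def step_angles(points):
    """Direction angle of each step between consecutive points."""
    return [math.atan2(y2 - y1, x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

def gently_bent(points, eps):
    """True if no step angle differs from the mean angle by more than eps."""
    angles = step_angles(points)
    mean = sum(angles) / len(angles)
    return all(abs(a - mean) <= eps for a in angles)

def split_gently(points, eps):
    """Greedy partition into gently bent pieces.

    Consecutive pieces share an end point; these shared points play the
    role of the artificial vertices k, l, m of the figure.
    """
    pieces, start = [], 0
    for end in range(2, len(points) + 1):
        if not gently_bent(points[start:end], eps):
            # The piece without its last point was still gently bent.
            pieces.append(points[start:end - 1])
            start = end - 2  # new piece starts at the shared vertex
    pieces.append(points[start:])
    return pieces

straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
elbow = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(gently_bent(straight, 0.1), gently_bent(elbow, 0.1))
print(split_gently(elbow, 0.1))
```

For the elbow, the scan introduces one artificial vertex at the corner (2, 0) and returns two straight pieces that share it.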


3. More complete consideration of the shape of the segments is
obtained as follows:
(a) For parallelism, by requiring that two segments be parallel
only if one is a translation of the other. Generally, this
is a comparison that takes a time proportional to the length
of the segment. Chain encoding {Freeman} {Conrad} is suggested.
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(b) For colinearity, by discovering properties or features that
"carry through" or are common. Among these are:


1. Mathematical "regularity" of the segments. Both segments
are described by the same or similar polynomials, etc.


2. Heuristic properties: there must exist properties which
will select with high probability the "right" continuation.


3. Outside of the set of geometric properties, we have
color, texture, etc.


The same line disappears at b and appears
at c, making b and c "matching T's", but to
discover this fact it is necessary to have a
concept of "good continuation" or "good
contour".


Alternatively, we may forget these properties here and include
them in models of our curved objects, but then we are forced
to make searches in our scene like those made by DI or TD
{my M.S. Thesis}.


Fig. 'SUITCASES'


Heuristic properties of segments (yet to be
determined) could select a "correct" match
for endings a, b, ..., k, l.
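
The translation test proposed in 3(a) for parallelism can be sketched with chain codes. The sketch assumes that each segment has already been digitized as a list of grid points in which consecutive points are 8-neighbours, and that the test is exact (no tolerance); the thesis only suggests chain encoding, so the coding convention and the function names below are illustrative assumptions. Two curves are translations of each other exactly when their chain codes coincide, and the comparison takes time proportional to the length of the code.

```python
# Eight-direction chain code: 0 = east, then counter-clockwise.
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(points):
    """Chain code of a curve given as successive 8-connected grid points."""
    return [DIRECTIONS[(x2 - x1, y2 - y1)]
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

def reversed_code(code):
    """Chain code of the same curve traversed from the other end."""
    return [(c + 4) % 8 for c in reversed(code)]

def parallel_by_translation(p, q):
    """True if q is a translation of p, traversed in either direction."""
    a, b = chain_code(p), chain_code(q)
    return a == b or a == reversed_code(b)

p = [(0, 0), (1, 0), (2, 1), (3, 1)]
q = [(x + 5, y + 7) for x, y in p]           # p translated by (5, 7)
r = [(0, 0), (1, 1), (2, 1), (3, 1)]         # same end points, other shape
print(parallel_by_translation(p, q))
print(parallel_by_translation(p, list(reversed(q))))
print(parallel_by_translation(p, r))
```

A practical version would need some tolerance for noisy digitization, which is not attempted here.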
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4. Bodies with no edges and vertices are in principle easily identified
by SEE. See fig. 'FRUIT'.


At some point, we have to know what we want 


The bodies have no curved edges, and no vertices. The entire
surface is smooth; no sharp edges or pointy corners. Examples:
an inflated balloon, a frankfurter, a face, a cloud.

It is doubtful that we could do something here with SEE. We
could try to postulate "artificial" vertices, using stereo perhaps,
at the points where the 3-dim curvature is large, and then postulate
lines between such vertices. This looks bad.

Or we could reason as follows: since these objects do not
have vertices or edges, the only vertices appearing in the
scene must separate two bodies. They will be mainly T-joints
(cf. page 46).

In principle, separation into bodies looks promising, but
recognition (the answer to "what is the name of this object?")
seems difficult. Nevertheless, it is not clear that with such a
simple set of heuristics we could work successfully with objects
as complicated as a human face, a blob of falling water, an
amoeba, the surface of the sea (?).


As the complexity increases, the concept of "body" depends less and
less on geometrical properties (disposition of edges, vertices, ...)
and more and more on purpose (Is a skeleton an object? Or perhaps the
femur bone alone? The answer varies with our intention, with the context).


Thus, models are necessary again.


See also 'Do not use over-specialized assumptions. . .', page 252.
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APPENDIX TO SECTION ON CURVED OBJECTS


This appendix may be omitted in a first reading.


Requirements for the preprocessor   The preprocessor that feeds data
to SEE has to find only:
1. The lines of the scene.
2. The vertices.
3. The local slopes at each vertex.
4. See also comments to figure R17.
5. Illegal scenes (page 2(7) should be detected by the preprocessor.


How bad will curved objects be?   For objects
where the curved edges are gently bent, SEE
will work fairly well. The more an edge
departs from its rectilinear equivalent,
the worse SEE will work; T-joints will be
difficult to find, a FORK may transform
into a 'T', etc. (I am talking about the
current SEE, described in the listings).


Additional information could be used   So far, we are trying to
identify objects on the basis of form alone, i.e., geometrical
considerations. This is asking a machine to do more than a human being does.
Ambiguous line drawings, such as ARCH, become unambiguous when we
introduce shading, lighting, texture, color, etc. All of these
properties could be used by SEE. In fact, consider how easy it would be
to identify bodies if each one of them were of a different color (and we
could sense that fact).
Psychological evidence   Knowledge of the algorithms used by human
beings for shape continuation (page 188) is relevant. We quote from
Krech and Crutchfield {1958}:
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Grouping by Good Form. Other things
being equal, stimuli that form a good figure
will have a tendency to be grouped. This
is a very general formulation intended to
embrace a number of more specific variants
of the theme, traditionally classified as follows.


1. Good continuation. The tendency for
elements to go with others in such a way as
to permit the continuation of a line, or a
curve, or a movement, in the direction that
has already been established (see Fig. 37c).

2. Symmetry. The favoring of that
grouping which will lead to symmetrical
or balanced wholes as against asymmetrical
ones.

3. Closure. The grouping of elements in
such a way as to make for a more closed or
more complete whole figure.

4. Common fate. The favoring of the
grouping of those elements that move or
change in a common direction, as distinguished
from those having other directions
of movement or change in the field.


It seems plausible to consider that the
percepts resulting from all of the above
determinants would be such as to meet the
criterion of a good figure, that is, one that
tends to be more continuous, more symmetrical,
more closed, more unified.

Now the reader will see that a difficulty
with this general proposition regarding
grouping centers on the crucial phrase
"good figure." How can we know which
configuration of stimuli is "better" than
another?

To escape from this difficulty, we need
to have independent criteria of what is a
good figure. Some approach can be made
to this; for instance, in the case of "symmetry"
there are objective rules we can
apply to determine the relative symmetry
of various figures. The same is true of simple
cases of "closure." (See Box 21 for a
relevant experiment.)

But we are far from being able to state
such criteria when we deal with the highly
complex configurations of our normal perceptual
experience. Part of the difficulty
stems from the fact of individual differences
among perceivers. One man's mess
may be another man's order. And this may
reflect the important role of learning and
past experience in the genesis of "good
figure."


FIG. 37. Examples of grouping. In a, the dots
are perceived in vertical columns, owing to
their greater spatial proximity in the vertical
than in the horizontal direction. In b, with
proximity equal, the rows are perceived as
horizontal, owing to grouping by similarity. In
c, the principle of good continuation results in
seeing the upper figure as made up of the two
parts shown to the left below, even though
logically it might just as well be composed of
the two parts shown to the right below, or indeed
of any number of other combinations of
two or more parts. (Adapted from Wertheimer,
1923.)


[Box text from the quoted page is illegible in this copy. Recoverable
reference: Attneave, F. 1954. Some informational aspects of visual
perception. Psychol. Rev., 61, 183-93.]
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ON OPTICAL ILLUSIONS


Given the nature of SEE, we will restrict the meaning of 'optical
illusion' to illusions formed by solids, that is, ambiguities or
inconsistencies when we (or the program SEE) try to find 3-dim bodies
in a scene; thus, the Müller-Lyer illusion ("A" in the topmost figure)
is not considered.


[Figure and accompanying text illegible in this copy.]


Three kinds of illusions   According to this, we may elementarily
classify the "scenes" that are unlikely to occur (that is,
that are not "standard" or "normal") into three types:
Possible but no "good" interpretation.
Ambiguous -- several good interpretations.
Impossible: without interpretation.


Like POLYBRICK {Guana}, SEE is not especifically ‘designed to 
It was primarily designed to analyze "real 
world" scenes; hence, an input scene that produces an illusion (in 


handle optical illusions. 


a human) is not likely to occur as input to SEE. 


the same way that we may overtest a program for square roots by asking 
for the square root of ‘APPLE’ NO , we may test SEE with some 


ambiguous scenes. Let us see what happens. 


POSSIBLE BUT NO "GOOD" INTERPRETATION 


because they violate rules that most objects obey. 
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finite 


Nevertheless, in 


Some objects. do not ‘make sense’ 


Nevertheless, it 


oe ae Se 


HULLSe ed es 


aoe ABD 


ACTUAL IMPOSSIBLE TRIANGLE was constructed by the author and his colleagues. 
The only requirement is that it be viewed with one eye (or photographed) from exactly 
the right position. The top photograph shows that two arms do not actually meet. When 
viewed in a certain way (bottom), they seem to come together and the illusion is complete. 


(From Gregory). 
One of the strong rules used by humans is that objects whose pic-
tures show straight lines have indeed straight edges; another strong
rule is to assume the corners to be like the corners of a cube (faces
meeting at right angles). Under these rules, the above triangle
does not make sense and people will classify it as an "impossible"
object ('VARIANT' will be an "impossible" object; Penrose's Triangle
will be "3 sticks forming an impossible configuration or scene";
"mounted in a funny way"; can not be seen as representing a single
object lying in space). For instance, Gregory {Scientific American}
tries to explain that the triangle has a real 3-dim object as origi-
nator, by constructing a body consisting of three rectangular
parallelepipeds ("bricks") joined at right angles, and then taking a
picture from a special direction, so that the free ends a and b
seem to touch:


Fig. 'VARIANT'


These rules (faces meet at right angles; straight lines mean 
straight edges) are deeply ingrained into people, but nature does not 
need to follow them always. The Penrose Triangle can be obtained by 
photographing a 3-dim triangle with curved edges and skewed corners, 
where each side touches the other two. 


SEE finds three objects in figure ‘Penrose Triangle.' 


Other examples follow. 


Figure 'BLACK'


People assume that faces meet at right angles, and this object
violates that rule, making it "impossible" or odd-looking.

It is possible to construct object 'BLACK' with planar faces. See
figure 'TEST OBJECTS', page 209. SEE finds one body in 'BLACK'.


The object at right looks impossible if we assume all faces to be
flat. If face aeb is curved, the object is plausible. R is its
reflection on mirror M, and Q a smoother version of R. Q looks
"normal"; by deforming Q we could obtain R.


Unlike humans, SEE does not hold these "very common rules" as
inviolable; SEE does not have any special problems with these
"strange but true" objects.

No suggestion of superiority should be drawn from these rare cases;
in other situations SEE makes mistakes that a human being does not
(see figure 'SPREAD').

Of course, SEE holds its own rules (for example, those of
table 'Global Evidence') as inviolable; hence, given a "rare enough
scene" it will make mistakes (cf. assertion in page 51, after the
Theorem). This is a similarity of behavior, I think, between people
and SEE -- each one follows rather rigidly a small set of rules
(see also conclusion at end of section).
Besides, often humans will see the 'impossible' object as an
object, doing SEE's job just as well.
Figure 'STAIRCASE'


Figure 14. 'Impossible Object.' This can be drawn, but it corresponds
to no possible physical object. (From Penrose, L. S. and Penrose,
R. (1958). Brit. J. Psychol., 49, 31.)

(caption by Gregory)


The "always descending staircase." {Gregory, in {Foss}}
The caption is wrong; this object could be constructed in the real world,
if some surfaces are curved and/or the faces at the corners do not meet
at right angles. Example of an object "possible but without 'good'
interpretation." See also Metatheorem on page 29. Again, the
"impossibility" or oddness of 'STAIRCASE' comes from assuming the rules
'straight lines in the drawing correspond to straight edges in 3-dim'
and 'faces meet at right angles, like corners of a cube' inviolable.


AMBIGUOUS - TWO GOOD INTERPRETATIONS 
These are scenes that can be interpreted in several correct
(non-paradoxical) manners, which are also "sensible" (as opposed to
the Trivial Solution of page 41).


For instance, a scene like


that can be interpreted as


(A)


or as


(B)


SEE will generally give one of the possible answers, although 
not necessarily the one preferred by humans. In this example, SEE 
chose ( B ). 

The following scene, locally ambiguous, is correctly parsed by 


our program. 


Sometimes, the conservatism of SEE and its partial 
insufficiency to make very global judgements will leave a body 
unconnected; for instance, the three faces of one cube below will 
be reported each one as a separate object, due to insufficient 


links. 
IMPOSSIBLE: WITHOUT INTERPRETATION


Images that can not be the product of photographing (projecting) a
3-dim scene. These objects do not have physical existence.


This scene is without 
interpretation, meaning 

no 3-dim scene (with 3-dim 
bodies) could have 
produced it. 


In figures like the above one, men are unaware of the extension 
of the background, and s makes sense even if B is back- 
ground. SEE is unable to make this mistake, and its analysis of 
the scene will reflect the fact: the preprocessor will complain that 
one region, the background, is neighbor of itself. See comments to 
scene R3, page 113.
Of course, in these cases there is no answer to the question
"which are the bodies in the scene?" Whatever answer SEE (or anybody
else) gives, it is wrong.
Nevertheless, according to our meta-theorem (page 33), there is
an extremely easy way to discover and reject these impossible scenes:
all of them are necessarily illegal scenes (q.v., page 217). And we know
how to detect illegal scenes. SEE (or its preprocessor, rather) already does that.


SEE detects all impossible scenes, by refusing the data as an 
illegal scene. 
A PROGRAM TO DISCOVER HUMAN OPTICAL ILLUSIONS


Some scenes get classified by our metatheorem as 'possible but
not "good" interpretation', and likewise by SEE, which does not refuse
to analyze any legal scene.

Nevertheless, a person will stubbornly classify them as 'odd-looking'
or 'not making sense' or 'impossible', even if we teach him
the solution obtained by SEE (figures 'Penrose Triangle', 'Black',
'Staircase', 'CONTRADICTORY').


Figure 'CONTRADICTORY'


One object is found by SEE: (:1 :2 :3 :4). 
As such (since it is a legal scene), SEE 
classifies it as ‘possible but not "good" 
interpretation’. A person will classify 
it as "not making 3-dim sense": a human 
optical illusion. Is it possible to 
reconcile these views? 


Of course, the metatheorem (page 29) ensures that there is at
least one solution, so SEE's interpretation is "right" (it has chosen
one correct answer, generally not the trivial solution given by the
metatheorem), and the mortal is wrong. Also, the theorem of page 50
ensures that any system (human or computer) that uses too "local"
rules (see fig. 'MACHINE') will make at least one mistake, no matter
what rules he (or it) uses.
H-optical illusions. There is thus a disagreement between SEE and our
fellow subject, because SEE has classified the scene as 'possible but
no "good" interpretation' and our man has said 'contradictory as a
three-dimensional scene'. Let us call these human optical illusions
(such as 'Contradictory', 'Staircase', etc.) by the name h-optical
illusions.
What to do in these disagreements? Who is right?


SEE is right. The above comments seem to indicate that the electronic
data-processor is correct. The human has used excessively "local"
rules. That being the case, we can teach and train (if avoiding
future errors is desirable) our subjects to "understand", rationalize
and make sense out of these h-optical illusions. Indeed, that is what
is tried in figures 'Black', 'Penrose Triangle', etc. Different
people may show different degrees of (h-optical) illusion before
training and after training. This training is possible (see Box).

In other words, if SEE is right, the computer scientist has
nothing to do; it is all up to the psychologists and educators.


Man is right. We may hold the view that the human answer is still
preferable. Then, to our relief, man is right and SEE is wrong.
It is necessary (perhaps) to modify and correct SEE, so as to emulate
personal behavior.* We suggest a way to do this.

It is possible to enable SEE to detect these h-optical illusions, so
that it will classify the legal scenes into "possible" or "h-optical
illusions."

Like the problem of discriminating between background
and objects (see section 'On background discrimination by Computer'),
this is an interesting project from the "psychological" point of view
but, as in the background case, it is not essential at the moment
for our vision-robot work.


* Strictly, there is a third possibility: both are wrong.
BOX


There is generally a wealth of available information—though none entirely 
reliable—for settling the size and distance of external objects, with sufficient 
precision for normal use. As is well known, the visual system makes use of 
a host of ‘depth cues’, such as gradual loss of detailed texture with increasing 
distance, haziness due to the atmosphere and nearer objects partly hiding 
those more distant. These cues were discussed in the nineteenth century 
by the great von Helmholtz (1925), who fully realised their importance, and 
they have been the subject of many investigations since, especially by 
J. J. Gibson (1950). Whatever the richness of depth cues, however, the visual 
input is always ambiguous. Though the brain makes the best bet on the 
evidence—it may always be wrong. 

The kind of mistakes which occur when the bet is on the favourite though 
the favourite is not placed, is shown most dramatically by the demonstrations 
of Adelbert Ames (1946). The most impressive demonstration is given 
simply with a room which is non-rectangular, but so shaped that it gives the 
same retinal image as a rectangular room to an eye placed in a certain 
position. Now clearly this room, though queer shaped, must appear the 
same as a normal rectangular room, for it gives the same image to the eye. 
But consider what happens when objects are placed inside the Ames room. 
The further wall recedes at one side, so that an object or person standing in
one corner is actually at a different distance than is a second object placed
at the other far corner. These objects (or people) appear, however, to be
at the same distance—and they are seen the wrong size. This is clear evidence 
that we assume rooms to be rectangular (because they usually are) and we 
interpret the size of objects according to their distance as given by this 
assumption. When the assumption is wrong we see wrongly. What Ames 
did was to rig the odds, and then we make the wrong decision on size and 
distance. A child may appear larger than a man. We may know this is 
absurd and yet continue to see a bizarre world. The retinal image is all 
right, but the odds have produced the wrong internal file cards and then the 
human seeing machine is upset, and gives a wrong answer. 

It is interesting that the Ames room is seen correctly by peoples, such as 
the Zulus, brought up in a ‘circular culture’ of beehive huts where there are 
few reliable perspective features, such as rectangular corners and parallel 
lines, in their visual environment. To the Zulus, the odds are not rigged by 
the Ames room—to them this is not misleading perspective. They are not 
subject to this illusion, but accept the room as the shape it is, and see the 
objects in it correctly in distance and size. This is a matter of very real 
importance. It shows that when we are transferred to an alien or bizarre 
environment, where our filing cards are inappropriate, we interpret the 
images in the eyes according to principles found reliable in the previous, 
familiar world—but now they may systematically mislead and then percep- 
tion goes wrong. Space travellers beware! {Gregory, in {Collins 

and Michie}} 
A possible way to attack the problem is
     (1) To identify each link with whoever proposed it.
     (2) To set up systems of simultaneous "symbolic" equations.
     (3) To solve them by elimination.


We elaborate:


(1) Mark each link with the name of the heuristic that produces it.
    After obtaining the 'maximal' nuclei by GLOBAL and LOCAL, several
    links are left (for example, three in fig. 'FINAL-BRIDGE')
    and ignored by the current SEE. Instead, one could see what
    kind of links they are, and one has in this way more information
    about the type of contradictions in the scene.

(2) Introduce a 'conditional' link: regions :1 and :2 belong to
    the same body if region :3 does not. An OR link is now possible
    by use of the conditional, since a ⊃ b ≡ b ∨ ¬a.

(2.3) Introduce a 'NOT' link: :3 ≠ :5, regions :3 and :5 do not
    belong to the same body.

(2.6) As in ordinary algebraic equations, a system of n simultaneous
    equations means that all of them must be satisfied;
    the "AND" of all must be true. Thus, AND is implicit in our
    notation. So far, we have OR, AND, NOT, IMPLIES (conditional):
    we have more than necessary.


At the end, we have a system of simultaneous equations
like these, where :1 = :2 means both belong to the same body; this
is an equivalence relation so I use the = sign:

     :1 = :2    OR    :3 = :5
     :3 ≠ :2    ⊃     :1 = :4                    (E)

We now proceed to "solve" these equations. Three things could happen:


-- Exactly one solution is found. This is the normal case, and
that solution tells what the bodies are. Familiar, "clear", possible
scenes will fall in this case.


-- More than one solution is found consistent with our equations.
All are reported. This is the case "Ambiguous -- several good
interpretations."


-- No solution is found. This is a genuine h-optical illusion,
corresponding to a contradiction in the equations. For instance, in
fig. 'CONTRADICTORY', equations set by the T-joints between :2 and
:3 would be inconsistent with those set by the Arrows and Forks.


How to solve the equations (E). By the solution to (E) we mean a
division of the scene (:1, :2, ..., :m) by means of a partition of
the form

     (:1 = :5 = :7 = :6),
     (:3 = :2),
     ...

which is consistent with (E).
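
For very small scenes the definition above can be tried directly. The following is only an illustrative sketch in Python (not the language of SEE): it enumerates every partition of the regions and keeps those consistent with a system of equations. The helper names (partitions, same_body, solve, classify) and the sample system are hypothetical and are not the equations (E) of the text; the conditional is encoded through a ⊃ b ≡ b ∨ ¬a.

```python
def partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller


def same_body(partition, a, b):
    """True when regions a and b lie in the same block."""
    return any(a in block and b in block for block in partition)


def solve(regions, equations):
    """Return all partitions that satisfy every equation (implicit AND)."""
    return [p for p in partitions(regions)
            if all(eq(lambda a, b: same_body(p, a, b)) for eq in equations)]


def classify(solutions):
    """Map the number of solutions to the three cases of the text."""
    if not solutions:
        return "h-optical illusion (no solution)"
    if len(solutions) == 1:
        return "exactly one solution"
    return "ambiguous (several solutions)"


# Hypothetical system over regions 1..4 (an assumption for illustration):
#   :1 = :2
#   :3 != :1
#   :3 != :2  implies  :3 = :4     (written as  :3 = :2  OR  :3 = :4)
E = [
    lambda same: same(1, 2),
    lambda same: not same(3, 1),
    lambda same: same(3, 2) or same(3, 4),
]
solutions = solve([1, 2, 3, 4], E)
print(solutions, classify(solutions))
```

The number of partitions grows very quickly with the number of regions, so a search of this kind is only usable on toy scenes.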


In the current SEE,
(a) The equations are only equalities: :1 = :2.
    Also, equations of the type :1 ≠ :2 are taken into
    account by inhibitory mechanisms, such as NOSABO.
    No conditional links exist.

(b) Since all equations are of the type :2 = :3, the solution
    is obtained by applying transitivity, that is,

        1 = 2
        2 = 3
        -----
        1 = 3



Except that we require two antecedents for application
of transitivity (two strong links):

        1 = 2
        2 = 3     =>  (1 = 2 = 3)
        1 = 3     =>  (1 2 3)

(parentheses indicate nuclei.)
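
One possible reading of the two-link requirement is sketched below in Python, for illustration only: two nuclei are merged only when at least two links connect them, and the process is repeated until nothing more can be merged. The function name form_nuclei, the `needed` parameter and the sample links are assumptions, not part of SEE.

```python
def form_nuclei(regions, links, needed=2):
    """Merge nuclei joined by at least `needed` links; repeat until stable.

    `links` is a list of (region, region) pairs; a pair may appear more
    than once when several heuristics propose the same link.
    """
    nuclei = [frozenset([r]) for r in regions]
    merged = True
    while merged:
        merged = False
        for i in range(len(nuclei)):
            for j in range(i + 1, len(nuclei)):
                a, b = nuclei[i], nuclei[j]
                # Count the links running between nucleus a and nucleus b.
                n = sum(1 for x, y in links
                        if (x in a and y in b) or (x in b and y in a))
                if n >= needed:
                    nuclei = [m for k, m in enumerate(nuclei)
                              if k not in (i, j)]
                    nuclei.append(a | b)
                    merged = True
                    break
            if merged:
                break
    return nuclei


# Hypothetical links: two between 1 and 2, one each for 2-3, 1-3 and 3-4.
links = [(1, 2), (1, 2), (2, 3), (1, 3), (3, 4)]
nuclei = form_nuclei([1, 2, 3, 4], links)
print(nuclei)
```

In this example regions 1 and 2 merge first; region 3 then has two links to the nucleus (1 2) and joins it, while region 4, with a single link, is left as a separate nucleus (the conservative outcome).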
An exhaustive search (which successively tests each possible
partition) for the solution to (E) is impractical except in very small
scenes, and heuristic methods are needed.
I suggest starting from the equalities such as 1 = 2
                                               2 = 3
and forming nuclei with the current SEE, except that at each step
we check to see if our current nuclei satisfy all of (E); for
disjunctive equations such as "4 = 5 OR 6 ≠ 7 OR 4 = 6"
we try each branch of the OR in turn, rejecting those which lead to
no solution (this may be pretty combinatorial, too).
Perhaps it is possible to use more Logic here -- some sort of
theorem proving.
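
The branching suggested above might look as follows. This is a hedged Python sketch, not an existing part of SEE: each equation is a list of alternatives (the branches of an OR), each alternative is a tuple ("=", a, b) or ("!=", a, b), and a branch is abandoned as soon as an asserted inequality is violated. The representation, the names consistent and search, and the extra equation 6 = 7 in the example are assumptions made for illustration. Regions that nothing forces together are left in separate bodies.

```python
def find(parent, x):
    """Follow parent pointers to the representative of x."""
    while parent[x] != x:
        x = parent[x]
    return x


def consistent(regions, equal, unequal):
    """Merge the asserted equalities; return the partition obtained,
    or None if some asserted inequality is violated."""
    parent = {r: r for r in regions}
    for a, b in equal:
        parent[find(parent, a)] = find(parent, b)
    if any(find(parent, a) == find(parent, b) for a, b in unequal):
        return None
    blocks = {}
    for r in regions:
        blocks.setdefault(find(parent, r), []).append(r)
    return sorted(blocks.values())


def search(regions, clauses, equal=(), unequal=()):
    """Try each branch of each OR in turn; yield the surviving partitions."""
    if consistent(regions, equal, unequal) is None:
        return                      # this branch leads to no solution
    if not clauses:
        yield consistent(regions, equal, unequal)
        return
    first, rest = clauses[0], clauses[1:]
    for op, a, b in first:
        if op == "=":
            yield from search(regions, rest, equal + ((a, b),), unequal)
        else:
            yield from search(regions, rest, equal, unequal + ((a, b),))


# 1 = 2, 2 = 3, a hypothetical 6 = 7, and "4 = 5 OR 6 != 7 OR 4 = 6".
clauses = [
    [("=", 1, 2)],
    [("=", 2, 3)],
    [("=", 6, 7)],
    [("=", 4, 5), ("!=", 6, 7), ("=", 4, 6)],
]
results = list(search(list(range(1, 8)), clauses))
print(results)
```

Here the middle branch (6 ≠ 7) is rejected because 6 = 7 was already asserted, and the other two branches each give one partition, so this toy system would be reported as ambiguous.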
Conclusions and conjectures. The similarities between SEE and people
(see also 'Human perception vs. computer perception', page 25%) stem
from the fact that, like SEE, people seem to use only a small number
of rules (although not necessarily those used by SEE), which work in
almost all cases, but when these rules lead to an ambiguity or
inconsistency ("conflicts"), there is reluctance to abandon them, and
mistakes or impossibilities are produced.

It is possible that, like SEE, people use primarily local clues, 
and with less frequency more global information to disambiguate 
interpretations. I think that, in the presence of objects (in 2-dim 
line drawings, such as 'MOMO', for instance) not seen before, humans 
follow general rules not unlike those used by SEE to distinguish 
or decompose a scene into bodies. Rules that apply to all polyhedra 
have to be invoked, since in presence of previously unseen objects, 
humans can not use a model of the object. 

The more familiar an object is (or if we have reason to suspect it
or expect it), the faster we abandon the general rules and propose its
model as a possible explanation of part of a scene; we then jump to
a model matching routine (a la DT {MAC TR 37}) that tries to fit the
model to part of the scene (to a semi-isolated body); general rules
a la SEE prevent us from overflowing with our model into other bodies,
and help us to deal with partially occluded bodies.
ON NOISY INPUT


The performance of our programs is analyzed when the data has 
imperfections consisting of (1) misplaced vertices, (2) missing 
edges, (3) spurious extra lines, (4) missing faces, (5) two vertices 
merged. 

The section ‘Analysis of Many Scenes' contains results of SEE 
when applied to imperfect scenes. 


Summary. It is easy to predict the operation of SEE when the two-
dimensional data supplied is clean, in the sense of being an accurate
representation of the three-dimensional scene.
In practice, of course, errors will occur in the data and it becomes
important to know how sensitive our program is to them.

SEE has some serendipity. Many of the imperfections in the 
data do not cause mistakes in the linking procedure, or the link 
misplacements are not enough to cause erroneous identification. 

But mistakes are made. 
Here is how different types of imperfections are handled: 


-- The assignment of types to vertices is highly insensitive to errors
   in the position of each vertex, except T's that become Forks or
   Arrows. Two cures to the exceptions were found, only the first
   of which is implemented:
   (1) Allow tolerances in concepts of parallelism and collinearity.
   (2) Allow a long but slightly twisted rectilinear segment to be
       "straightened", as indicated in comments on scene R17.

-- Missing edges are subdivided in three classes (discussed below);
   two of them produce recoverable or detectable errors (hence,
   susceptible of correction or prevention). It will be difficult to
   detect if a segment of the third class is missing; these will
   produce recognition mistakes.

-- Additional lines, like the ones caused by edges of shadows, are not
   easily detected as spurious or superfluous. Their presence mainly
   produces a diminution in the number of useful links, thus sometimes
   causing too conservative behavior -- i.e., proposition of too
   many bodies.

-- Whole faces may be missing. Ordinarily (see scenes L2, L9T)
   the remaining part of the body gets correctly identified.


OBTAINING THE DATA


The scenes analyzed by our program in this thesis were obtained
by one of two methods:


By free drawing. A line drawing representing three-dimensional objects
was made; the coordinates of each vertex were accurately measured (or
computed) and the information was put in the 'Input Format' form
previously described. Also the regions belonging to the background
were indicated as such.


These scenes have mnemonic names such as TRIAL, BRIDGE, etc. 


What kind of projection did you use? Were these isometric drawings? 


Since no assumption is made on the rectilinear objects being drawn,
the drawings are not isometric, or perspective, or ... projections.
They could be any of them. It is not assumed that "we are dealing
with prisms, with faces of a body meeting at right angles (like the
corners of a cube), with convex objects." Neither the drawings nor
the program make any assumption of this type. If the reader wishes
to adopt the assumption specified above in quotation marks, then the
drawings will correspond to orthogonal projections of three-dimensional
scenes.

No support hypothesis is needed: if necessary, the objects could
be floating in a transparent fluid having their same density.


By construction. Arbitrary but not too complicated objects were cut
from pine wood, with flat surfaces, and painted black. Their edges
were painted white. By placing them on a black table (see first few
pictures of this thesis) in different positions and combinations,
three-dimensional scenes were created (see figure 'TEST OBJECTS').
Pictures were taken with high contrast film slightly under-exposed
so as to render black everything but the lines. Diffuse illumination
eliminated shadows [Great help was received in the pictorial task
from Messrs. William H. Henneman, Devendra D. Mehta and David Waltz,
and is here acknowledged]. The photographs were taken with a depression
angle from 45° to 90° (that is, looking down), 50 mm focal length
lens, 35 mm camera (standard equipment).
The size of the prints is approx. 8½ by 11 inches (21.5 by 28 cm).

If some lines were not clear, they were retouched with white ink.

If some lines were missing, they were NOT added.


Figure 'TEST OBJECTS'

Some of the objects used to produce many of the scenes
for our programs. Objects about 10 cm long.


Figure 'TEST OBJECTS' (Cont.)

Some bodies analyzed by SEE. The arrow indicates a body that
produces an optical illusion. It is unfamiliar, but real.

The pictures have names like L2 or R3, a letter and a digit.
Most of them are stereographic pairs, taken with both cameras having
parallel optical axes, and the sensitive film on the same plane.
SEE only analyzes one scene at a time, so the left picture is not
consulted when SEE analyzes the right picture, and vice versa.


A transparent millimetric mesh is laid on top of the prints,
and the coordinates are read by eye and put by hand in the 'Input
Format' form. The thickness of each line is about 1 mm (see figure
'TEST OBJECTS'); typically, the size of a scene is 10 or 15 cm: a
minimum error of ±1 per cent in the coordinates of a vertex is
already present. The slopes and directions of short segments suffer,
naturally, much greater errors. Also, if two vertices are too close
together (about two millimeters) they are merged and codified as one.
We are simulating the kind of mistakes that are likely to occur.

Also, some bias is introduced (no doubt) by the human operators.
[In reading the coordinates in most of the scenes, immense help was
given by Miss Cornelia A. Sullivan and Mr. Devendra D. Mehta; the
author acknowledges it.]


Irrespective of the generation method, the scenes that appear in
this thesis were drawn in their final form by the PDP-6 computer
through a Calcomp plotter, and then inked and finished by hand.
Thus, it is possible to perceive in many of them the imperfections
of the data that SEE had to analyze.
MISPLACED VERTICES


The coordinates of a vertex may contain a small error or 'noise'.

How does this affect the type of a vertex? Does the type change?

     L        Not affected.
     FORK     Not affected.
     ARROW    Not affected.
     K        Transforms into MULTI.
     X        Transforms into MULTI.
     T        Not affected, or transforms into ARROW,
              or transforms into FORK.
     PEAK     Not affected.
     MULTI    Not affected.



Many types are unaffected. Type K vertices transform into 
MULTI, but since K's are seldom used by SEE, this is no big loss. 

X's transform into MULTIs, and we lose two links here, which
makes SEE behave more conservatively. Also GOODT gets affected
(though not much).

The serious changes are the T's that get transformed into ARROWs
or FORKs, when these T's are matching T's. Because they are used
for linking otherwise disconnected pieces of a body, their loss
generally implies the partition of a body into two. See figure
'DISCONNECTED'.
(a)     (b)     (c)


Figure 'DISCONNECTED'


The T's under discussion are marked by
small circles. In (a), the misclassification
of these T's into Arrows or Forks does not
break the occluded body, which retains its
unity thanks to tl. In (b), the same
misclassification does break the occluded
body, reporting two objects instead of one,
a possible but less desirable answer. If
the T's are not matching T's, as in (c),
their misclassification does not matter.


The loss of matching T's makes the program more conservative;
in a sense (see 'Desirability Criterion') this is tolerable.


DESIRABILITY CRITERION.

(1) We would like a SEE that never makes mistakes. Since this is
    not possible, then

(2) We would like it to make mistakes of only one kind, either
    join two bodies that should be left separated (intrepid,
    cavalier behavior), or leave unattached two nuclei that
    should be reported as a single object (conservative behavior).

(3) Among the two, we prefer a conservative SEE, because its
    errors will be easier to correct (cf. Stereo Perception).


What other perils does the misclassification of the T's bring?
We should worry if, due to errors caused by T's, the occluded
body joins the occluding one.


The T's should not originate the reporting of :1-2-3 as
part of one body.

Each T, when perturbed, will go to one of these states: (N) normal,
unperturbed; (L) "left", when E1 moves towards Ej, becoming
a Fork; or (R) "right", when E1 moves away from Ej, becoming an
Arrow.


For three T's of an occluded body, 3^3 = 27 states are possible.
They are shown on the next page, in table 'THREE Ts'.
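
The count can be checked by listing the states mechanically. The short Python fragment below is only an illustration; the labels follow the N/L/R convention just introduced, and the variable names are arbitrary.

```python
from itertools import product

# N: unperturbed T, L: the T has become a Fork, R: the T has become an Arrow.
STATES = "NLR"

# One label per combination of the three T's, e.g. "R L L".
three_ts = [" ".join(combo) for combo in product(STATES, repeat=3)]
print(len(three_ts))
```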


How many of these 27 states will produce mis-links joining 1 with 3
or 2 with 3 or 1 with 4 or 2 with 4 (none of the four regions is
necessarily background)? (In the accompanying figure the vertices are
labeled I to VI.)

None.


The reason is that (see description of NOSABO) a T or an Arrow
or an L inhibits the link shown below,


so that (a) An Arrow in position (I) [or (III)] suggests linking 1
            with 4. This link is inhibited by the L at IV [or VI].
            Example: Figure R L L in Table 'THREE Ts' (page 214).
        (b) A Fork in position (I) [or (III)] suggests
            (i)   linking 1 with 3. Inhibited because of the T or
                  Arrow in vertex III.
            (ii)  linking 1 with 4. Inhibited because of the L in IV.
            (iii) linking 4 with 3. Depends on outside considerations.
                  Discussed below.
            Example: L R L.
        (c) An Arrow in position (II) suggests linking 1 with 2.
            Inhibited or allowed according to vertex V. Example: R R L.
        (d) A Fork in position (II) suggests
            (i)   linking 1 with 3. Link inhibited by the T or Arrow
                  of I.
            (ii)  linking 2 with 3. Inhibited by the T or Arrow in III.
            (iii) linking 1 with 2. Inhibited or allowed according to
                  vertex V.
            Example: R L N.
Thus, no link is possible, even under these "noisy" circumstances,
between 1 and 3 or 2 and 3 or 1 and 4 or 2 and 4. That is,
the 27 cases of table 'THREE Ts' are treated correctly.
A possibility of bad linking exists between 4 and 3 in this
case, if two T's convert into Forks and "help each other":


Two links originate the joining of 4 and 3.


Rather than get involved in this sub-problem, we will point
out two solutions to the misplaced vertices: (1) by allowing some
tolerance in 'parallel' and 'collinear'; (2) by 'straightening out'
crooked or twisted segments. We explain.


Equal within epsilon (definition). a is equal within epsilon to b,
written a ==[ε] b (an "==" sign with ε above it), iff |a - b| < |ε|.
Generally, ε > 0.


Tolerances in collinearity and parallelism. Two lines are parallel if
the sine of the angle formed by them is smaller than SINTO
(sine ==[SINTO] 0). Currently, SINTO = 0.15.
Lines ab and bc are collinear if
length ab + length bc ==[COLTO] length ac. Currently, COLTO = 0.05.
We have implemented these definitions. Better definitions exist.
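
The two tolerance tests can be written down in a few lines. The Python sketch below is illustrative only and is not the code of SEE; the function names are hypothetical, points are assumed to be (x, y) pairs, segments are assumed to have non-zero length, and COLTO is assumed to be an absolute tolerance in the units of the coordinates (the text does not say whether it is absolute or relative).

```python
import math

SINTO = 0.15   # tolerance on the sine of the angle between two lines
COLTO = 0.05   # tolerance on (length ab + length bc) - length ac


def equal_within(a, b, eps):
    """a is equal within epsilon to b iff |a - b| < |eps|."""
    return abs(a - b) < abs(eps)


def parallel(p, q, r, s):
    """Lines pq and rs are parallel if the sine of their angle is ==[SINTO] 0."""
    ux, uy = q[0] - p[0], q[1] - p[1]
    vx, vy = s[0] - r[0], s[1] - r[1]
    cross = ux * vy - uy * vx                      # |u| |v| sin(angle)
    sine = cross / (math.hypot(ux, uy) * math.hypot(vx, vy))
    return equal_within(sine, 0.0, SINTO)


def collinear(a, b, c):
    """Lines ab and bc are collinear if ab + bc ==[COLTO] ac."""
    ab, bc, ac = math.dist(a, b), math.dist(b, c), math.dist(a, c)
    return equal_within(ab + bc, ac, COLTO)


print(parallel((0, 0), (10, 0), (0, 1), (10, 2)))   # slightly tilted line
print(collinear((0, 0), (5, 0.1), (10, 0)))          # slightly bent line
```

With these values a line tilted by about 5.7 degrees still counts as parallel, which illustrates how lenient SINTO = 0.15 is.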


These definitions allow most small inaccuracies in the coordinates 
of vertices to pass unnoticed. Although they are giving reasonable 
service, they are only temporary, since by relaxing too much the 
criterion for parallelism and collinearity, strange things could 
happen (fig. 'CROSSED'). 


Fig. 'CROSSED' 


A too lenient definition of parallel
and collinear could give the following
matching T's: a to d, b to f,
c to e.


See also on section ‘Analysis of many scenes' comments to L9 and ROT. 


e*gts 152, »5L). 
Straightening twisted segments. The definitive cure is simple:
reassign the slope of bc to be that of ad, if bc is small, ad large
and the angles at b and c are close to 180°. See also comments to
figure R17. This has not been implemented. In this way, all cases of
table 'THREE Ts' will be solved. See also comments to scene R4.

Probably the preprocessor will automatically take care of this 
rectification, since it may prefer to give a long segment ad instead 
of three almost collinear shorter segments ab, bc, cd. 
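
Since the rule has not been implemented, the following Python fragment is only a sketch of how it might be stated. The thresholds max_ratio and max_bend, and the function names, are hypothetical; the text gives no numerical values for "small", "large" or "close to 180°".

```python
import math


def angle_at(p, v, q):
    """Angle p-v-q in degrees (0 to 180), measured at vertex v."""
    a1 = math.atan2(p[1] - v[1], p[0] - v[0])
    a2 = math.atan2(q[1] - v[1], q[0] - v[0])
    d = abs(math.degrees(a1 - a2)) % 360.0
    return min(d, 360.0 - d)


def straightened_direction(a, b, c, d, max_ratio=0.2, max_bend=10.0):
    """Return the direction (dx, dy) to use for the short segment bc.

    If bc is short compared with ad and the angles at b and c are
    within max_bend degrees of 180, borrow the direction of ad;
    otherwise keep the direction of bc as measured.
    """
    short = math.dist(b, c) <= max_ratio * math.dist(a, d)
    flat = (abs(180.0 - angle_at(a, b, c)) <= max_bend and
            abs(180.0 - angle_at(b, c, d)) <= max_bend)
    if short and flat:
        return (d[0] - a[0], d[1] - a[1])
    return (c[0] - b[0], c[1] - b[1])


# A long, nearly straight line with a small kink between b and c.
print(straightened_direction((0, 0), (4, 0), (5, 0.1), (10, 0.1)))
# A genuinely bent line: the 45 degree step at b is kept.
print(straightened_direction((0, 0), (4, 0), (5, 1), (10, 1)))
```

The second call shows the risk discussed next: the thresholds decide which bends are treated as noise and which as real features of the object.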

Since the straightening of a segment replaces some known vertices 
(which we suppose inaccurate) by other idealized vertices, we may be 
introducing uncertainty, in the form of non verified hypotheses, to our 


data. The object in the scene could really be "crooked" or twisted. 


Fig. 'TWISTED'


The object to the left is really bent as shown.
If we idealize it as in the right, we are falsifying
the information about it.


By replacing it by an idealized version, we may be creating 
problems for its identification, when we want to assign a name to it. 
But notice that the 'unbent' version or idealization is handier for 
SEE. 
If the scene is very bad. Throw it away and read the scene
again. A simile indicates that the issue becomes one of allocation
of resources: if you receive a written message containing a few
wrong characters and missing words, you may use your brains and time
to deduce the omitted portions (by employing the redundancy, for in-
stance). If the dispatch is very garbled, you might as well request
a new one.


It is known how to handle small inaccuracies in the position
of the vertices.


MISSING EDGES 


From time to time, an edge will fail to show up in the scene, 
and the questions are (1) how much harm will be produced, and (2) 
how can we detect and correct the anomaly. An example appears on
page 141.


Illegal scenes. Lines that end abruptly produce illegal inputs,
suggesting that segments are missing.


(a) 
Fig. 'ILLEGAL' (b) 


In (a), a vertex has one edge. 

In (b), the network can be separated by erasing 
just one edge. 

Both are illegal scenes, indicating missing or 
extra lines. 


Also (figure 'ILLEGAL', (b)) a region can not be a neighbor of
itself -- another irregularity that points to deficient data. Cf.
comments to scene R3.

These constraints can be nicely exploited by a preprocessor. 
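
A preprocessor could test the two conditions of figure 'ILLEGAL' directly on the network of vertices and edges. The sketch below is an assumption about how such a check might look, not part of SEE: a scene is taken to be a list of edges between named vertices, `dangling_vertices` reports vertices with a single edge (case a), and `bridges` reports edges whose removal separates the network (case b), using a simple quadratic-time check.

```python
from collections import defaultdict

def dangling_vertices(edges):
    """Vertices that have exactly one edge (figure 'ILLEGAL', case a)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d == 1)

def count_components(nodes, edges):
    """Number of connected pieces of the network."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(adj[n] - seen)
    return count

def bridges(edges):
    """Edges whose erasure separates the network (figure 'ILLEGAL', case b)."""
    nodes = {v for e in edges for v in e}
    base = count_components(nodes, edges)
    return [e for i, e in enumerate(edges)
            if count_components(nodes, edges[:i] + edges[i + 1:]) > base]
```

Either list being non-empty marks the scene as illegal and points at the place where a line is probably missing or spurious.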


Line proposer and line verifier. A line proposer is a program that
suggests places where a line can be missing; a line verifier is
essentially a precise line finder that searches for a line in only a small
portion of the scene, as told by the line proposer.
In the body of this section we will develop several heuristics for
use in a line proposer. The verifier is not discussed.

Blum's line proposer. An algorithm has been designed by Manuel Blum
{1968} that will detect many places where lines are possibly missing.
It suspects concave regions. An angle bigger than 180° originates a
search for the omitted line in directions parallel to the neighbor
edges (fig. 'BLUM'). It also originates searches along its own
edges. In other conditions, a vertical line is searched.
No harm is done by a bad proposer. Only some time is wasted.


Figure 'BLUM'


Region :2 is suspected to contain undetected lines,
because it is concave. Vertex v is chosen because
its internal angle is bigger than 180 degrees.
From it, Blum's proposer will suggest to the line
verifier to look for lines in directions VA' and
VB' (broken lines), parallel to the neighbor edges
A and B. It also searches (dotted lines) along
the continuation to lines C and D.
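
A possible reading of this heuristic is sketched below. It is not Blum's program: the representation (a region given as a counter-clockwise list of vertices), the function names and, in particular, the sense chosen for the directions parallel to the neighbor edges are assumptions made for illustration.

```python
import math

def unit(dx, dy):
    """Unit vector in direction (dx, dy), rounded to absorb float noise."""
    n = math.hypot(dx, dy)
    return (round(dx / n, 6), round(dy / n, 6))

def propose_searches(polygon):
    """Map each vertex with internal angle > 180 degrees to search directions.

    polygon: region boundary as (x, y) tuples in counter-clockwise order.
    For such a vertex v, with previous vertex p and next vertex n, the
    proposed directions are the continuations of v's own edges (p->v and
    n->v) and the directions of the neighbor edges taken away from p and
    from n (an assumed orientation).
    """
    proposals = {}
    k = len(polygon)
    for i, v in enumerate(polygon):
        pp, p = polygon[i - 2], polygon[i - 1]
        n, nn = polygon[(i + 1) % k], polygon[(i + 2) % k]
        cross = (v[0] - p[0]) * (n[1] - v[1]) - (v[1] - p[1]) * (n[0] - v[0])
        if cross >= 0:          # convex or straight: not suspected
            continue
        directions = {
            unit(v[0] - p[0], v[1] - p[1]),      # continuation of edge p-v
            unit(v[0] - n[0], v[1] - n[1]),      # continuation of edge n-v
            unit(pp[0] - p[0], pp[1] - p[1]),    # parallel to neighbor edge at p
            unit(nn[0] - n[0], nn[1] - n[1]),    # parallel to neighbor edge at n
        }
        proposals[v] = sorted(directions)
    return proposals
```

For an L-shaped region the only vertex proposed is the inner corner, and the directions returned are the ones along which the missing internal edge would run. The line verifier would then be asked to look along each direction starting at that vertex.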


Internal edges. If a missing line is totally internal to a body, and
is not detected by the line proposer, its absence will at most cause
conservative behavior in SEE. In some cases its absence does not
confuse SEE (figure 'MISSING').

The majority of internal edges cause concave regions to appear 
(fig. 'BLUM'). They will be detected by a line proposer. 
Fig. 'MISSING'


Cases where the disappearance of an internal
line (dotted) does not separate the body.

In (a), the object separates into two.
This case is recognized by Blum's heuristics.
Else, SEE could check for this configuration
as a special case.


External edges. Edges that separate two bodies are called external.
If undetected, their disappearance will cause 'intrepid' errors by
SEE, which are undesirable (see 'Desirability criterion' on page 212).
Two cases result: (1) Only part of the edge disappears; there is a
possibility of correction. (2) The whole edge is both external and missing
(and the scene is still 'legal'): a mistake will occur. See figure
'EXTERNAL EDGES'.


Case (1) Only part of an external edge disappears. It can be
detected because
     (a) a concave region is generated, and
     (b) the region has internal angles bigger than 180° where a
         line "goes through": ab is collinear with cd.


Figure 'EXTERNAL EDGES'


A segment separating two bodies may disappear. 
(1) If that segment is part of a larger segment, 
it is possible to sense and correct the anomaly. 
(2) If a whole external edge is missing, its 
absence remains undetected, inducing a mistake 
in SEE. In (i) an external edge disappears, and 
creates an illegal figure. 


Case (2) The complete edge is missing. Then (b) of case 1 fails,
and detection is difficult.


SPURIOUS EXTRA LINES 


They are lines that "should not be there", such as those
caused by edges of shadows.


Fig. 'LIGHT AND SHADOW'

Each body becomes two; each one is recognized
independently by SEE. Four bodies are found.
Shadows of rectilinear objects travel in planes that (in theory)
part an object in two (or more): the illuminated part, and the dark
one. Each is a separate object by itself, according to our definition
(see 'Several definitions of a body'); since they have plane boundaries,
SEE should recognize them.

In practice, we have not tried our program with scenes having
lines produced by shadows. A conservative behavior, like in figure
'LIGHT AND SHADOW', is expected.

Some shadows gradually diffuse; multiple lights cause multiple
shadows. These problems may have to be solved by assuming or computing
the direction or position of the light sources.


MERGED VERTICES 


Two vertices fused in one will produce a diminution in the number
of useful links they report, since the resulting vertex will
be of type MULTI. Thus, conservative behavior is expected from SEE
in these cases (see Fig. L19, L17, R17, L4, etc. The program does
well in them, when not too many coincidences are present).

[SUGGESTION] It is possible to analyze the vertices of type
MULTI and try to decompose them in simpler types (compare figure
R19 with WRIST). Read comments to R19 and 19.


CONCLUSION 


On scenes obtained from "real world" data, inaccuracies are
expected, and it is required of SEE to work well despite them.
Currently, the behavior of the program in these cases is not
discouraging, but is not extremely satisfactory, either. The
additional work needed depends heavily on obtaining genuine
test data, instead of the faked data used in the experiments
described.
BACKGROUND DISCRIMINATION BY COMPUTER


A program determines the regions that belong to the background
of a given scene; that is, the regions that are not members of any
of the bodies. Examples are given.
The program SEE needs to know which regions of the scene
belong to the background (cf. 'SEE, a program that finds bodies in
a scene'). At present, this information is supplied by the user,
as described in section 'Internal Format' and in section 'Input
Format' (page 63) of a scene.


In the current vision experiments, it is not difficult to 
determine the regions that form the background, since they are always 
black and homogeneous (see first few pictures in this thesis). But 
in more realistic scenes, there will be a great demand for a background 
finding program. 


Therefore, it is interesting to try to
develop a program to separate the "ground"
in the back from the objects in the
"foreground", having a limited information
consisting of the scene as described in
section 'Internal Format', namely, vertices
and edges.

That is, we will use in this task only
"geometric" properties.


Such a program has been written, and works automatically under
the command of PREPARA, the function that converts a scene from its 
‘Input Format' to its ‘Internal Format'. When the regions forming 
the background are not supplied, PREPARA activates our program, 
named BACKGROUND, and these regions are searched for; otherwise, 
SEE is supplied with the background regions as declared in ‘Input 
Format'. 
Example. Scene 'HARD'. The results obtained are


(SUSPICIOUS ARE NIL)

THE BACKGROUND OF HARD IS
(:34 :36 :35)
(:34 :36 :35)


Three regions are found to be part of the background: :34, :36,
and :35. That is correct.


We now proceed to describe the subroutines that make such 
identification possible. 
Suspicious. In a first pass, we collect the regions that "may be"
background, and call them "suspicious regions". Regions that are
not suspicious are LIMPIO (clean).

Ideally, if a region :R contains L's, FORKs, ARROWs or T's in 
the position below, it is not a part of the background. 


:R


(I) (II) (III) (IV)


FIGURE 'BACKGROUND'
In an idealized situation, :R can not be part of the
background: it is clean, or free of suspiciousness.
:R will be called 'LIMPIO' (clean).
(I) means that the background [almost] never is the internal
part of an 'L' (the region containing the angle smaller than
180 degrees).


(II) means that the background does not contain FORKs.


(III) means that the background is not in the "inside" of an ARROW
(the background is not a 'proper arrow').


(IV) means that the background can not be the flat region of a 'T';
this in turn means that a body can not disappear under the
background and then reappear at some other point:


:3 is not the background.


We reinterpret rules (I)-(IV) as follows:
(I) A region "inside" an L is LIMPIO (clean).
(II) A region containing a fork is LIMPIO.
(III) A region "inside" an arrow is LIMPIO.
(IV) A region "on the flat side" of a T is LIMPIO.


Clean Vertex (definition). A vertex is clean with respect to a
region if it indicates, through rules I-IV, that such region is LIMPIO.
For instance, K is clean for :1 and for :2,
since (III) indicates that :1 and :2 are
LIMPIO. K is not clean for :3.


These heuristics are not 100 per cent infallible; also, in a
moderately complicated scene, coincidences of vertices are bound to
occur, originating violations of I-IV. For instance, in figure CORN
(page 150), vertex UU is a Fork belonging to the background, in
contradiction with (II).
For completeness, we present a violation to each one of rules I-IV:


(I) (II) (III) (IV)


FIGURE 'VIOLATIONS'


:1 is the background. In all four cases,
vertex V violates the rule specified at the
bottom of the figure. They are rare cases.

The situation indicates that rules I-IV
provide noisy information, which has to
be dealt with carefully. That is what is done.

The vertices of each region are analyzed under rules (I)-(IV).
To allow for coincidences of vertices and rare cases (like those in
figure 'VIOLATIONS'), it is permitted for a suspicious region to
have a small number of clean vertices.

The number of clean vertices is compared with a quantity that
is a small fraction of L (the number of vertices on the boundary);
currently, that fraction is L/9.


If the number of clean vertices, that is, vertices satisfying
I-IV, is bigger than L/9, we call that region LIMPIO ("clean").
In addition, (a) If L is large (bigger than 25, currently),
                 that region is BIGFACE, such as :21 of
                 scene L19 (page 144);
             (b) Otherwise, it is only LIMPIO (normal case).

If it is not bigger than L/9, then it is SUSPICIOUS. Also,
             (a) If L is large (bigger than 25), the region
                 is BACKGROUND,
             (b) Otherwise it is only SUSPICIOUS (normal case).
That is, a LIMPIO region has to have at least

     1 + [one vertex of each nine]

"clean" vertices.

Example. Region :3 has four 'clean'
vertices (four vertices indicate that :3
is LIMPIO) -- it can not be SUSPICIOUS.


Figure 'EQUILIBRIUM'


(This scene is correctly analyzed by SEE)
None of the three vertices of :1 is clean;
:1 will become Suspicious (a candidate for
background). Five of the seven vertices of
:2 are clean, so :2 is LIMPIO. Note that
vertex C' is clean for :2 and not clean
for :1.
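
The first-pass classification reduces to two comparisons. The sketch below restates them in Python under the stated constants (one ninth, and 25 vertices); the function name `classify_region` and its argument names are illustrative and do not come from the listings.

```python
def classify_region(n_vertices, n_clean, ninth=9, big=25):
    """First-pass label of a region from counts of its boundary vertices.

    n_vertices: L, the number of vertices on the boundary of the region.
    n_clean:    how many of them are clean for this region (rules I-IV).
    """
    # "n_clean > L/9", written without division to stay in integers.
    if n_clean * ninth > n_vertices:
        return "BIGFACE" if n_vertices > big else "LIMPIO"
    return "BACKGROUND" if n_vertices > big else "SUSPICIOUS"
```

With the counts quoted for figure 'EQUILIBRIUM', `classify_region(3, 0)` gives SUSPICIOUS for :1 and `classify_region(7, 5)` gives LIMPIO for :2.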


For example, when we apply the function SUSPICIOUS (see listings)
to every region of scene SPREAD, the suspicious regions turn out to be:
Suspicious only: :35 :18 :34 :2 :3 :12 :11 :33 :37
                 :47 (:48) :46.
Background: :48.


Summary. By analysis of its vertices, each region is either LIMPIO or
SUSPICIOUS. The suspicious regions with more than 25 vertices are
classified right away as BACKGROUND: a suspicious region with many
edges is probably background.

The selection is done entirely using "local" properties: a
region is classified according to information supplied exclusively
by its own vertices.
FIGURE 'SPREAD'


Each region is classified as LIMPIO,
SUSPICIOUS or BACKGROUND.


More global indications. The goal is to decide which of the
suspicious regions are LIMPIO, and which ones are BACKGROUND.

-- Since two background regions can not be contiguous (the
   background can not be a neighbor of itself), suspicious regions that
   are contiguous with the background are cleaned and put in the
   LIMPIO status.

   In our example, :48 is background and therefore its
   suspicious neighbor :18 gets cleaned and becomes LIMPIO.


-- Links are established through the matching T's. We call them
   b-links.
   Ideally, a suspicious region b-linked to a LIMPIO region
   gets cleaned; a suspicious region b-linked to the background gets
   converted to background too.
Idealizing, suspicious region :1
becomes LIMPIO, and suspicious
region :2 becomes background.

A more complicated procedure is 
actually used. 


In practice, we allow for small errors as follows:


For each suspicious region, we notice if it is b-linked
to background (BA), suspicious (SO), or Limpio (LI).


BA -- --     If it is b-linked to background regions, we
             change it to Background, except if it has a
             background as neighbor, in which case we do
             nothing and continue.


() SO LI     If not b-linked to background, but b-linked both
             to Suspicious and Limpio regions,

             (1) If LI < SO, continue, do nothing.

             (2) If LI >= SO, classify this region as
                 Limpio (LI is the number of LIMPIO
                 regions b-linked to the current
                 region under consideration).


() SO ()     If b-linked only to suspicious, continue, do
             nothing.


() () LI     If b-linked only to Limpio, change it to Limpio.
             Note: Sometimes I write Limpio, sometimes LIMPIO;
             they mean the same.


() () ()     If not b-linked, continue, do nothing.

We keep applying these rules until no change is observed. In
this way, we have eliminated several suspicious regions.

In SPREAD, the suspicious regions were :35, :18, :34, :2, :3,
:12, :11, :33, :37, :47, :48, :46. :48 is known to be the background
(that was done in the first pass), so it is no longer suspicious. :18
is a neighbor of the background (:48), and got cleaned in the
page before this one.

:11 is b-linked with the LIMPIO :9 and with the suspicious :3.
Therefore, :11 changes to LIMPIO.

:3 is b-linked with the Limpio :11, so the suspicious :3
becomes Limpio.

:12 is b-linked to the Limpio :10, and gets cleaned.

:46 is b-linked to the background :48, and gets made
background, since :46 is not, at this moment, a neighbor of
background.

:34 is b-linked to the background :48, and gets made
background, since :34 is not a neighbor of background.

:37 is b-linked to the LIMPIO region :4, and transforms
into LIMPIO.

:35 is b-linked to the region :34, which is background,
so that the suspicious region :35 becomes background.

:2 is a suspicious region b-linked to the region :35, which
is part of the background. According to our rules, :2 becomes
part of the background. :2 is also b-linked to the background :48.


At the end, only regions :33 and :47 remain suspicious:
(SUSPICIOUS ARE (:33 :47))


We collect all these 'stubborn' suspicious regions and label
them background, except those which are neighbors of background.
A better procedure may be to make the exception in
those regions that are neighbors of suspicious regions.
That is, two neighboring suspicious regions prevent
each other from becoming background. I have not explored
this possibility.

In the example SPREAD, :33 and :47 are made background. 


If no region is background at this point, make one of the
"bigfaces" background. There is room here for improvement.


If no background yet, make background the region with most 
vertices. This is not yet implemented. 


In our example, the (final) background regions are:


:33 :47 :35 :34 :2 :48 :46    <- BACKGROUND OF 'SPREAD'.
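
The re-classification of suspicious regions can be summarized in a short sketch. It is not the BACKGROUND function of the listings: the names, the dictionary representation and the order in which regions are visited are assumptions. It covers the contiguity rule, the b-link rules applied until nothing changes, and the final treatment of 'stubborn' regions; the two fallbacks (bigfaces, region with most vertices) are left out.

```python
BACKGROUND, SUSPICIOUS, LIMPIO = "BACKGROUND", "SUSPICIOUS", "LIMPIO"

def relabel(status, blinks, neighbors):
    """Return a new {region: label} map with suspicious regions resolved.

    status:    first-pass label of every region.
    blinks:    {region: regions b-linked to it} (assumed representation).
    neighbors: {region: regions contiguous with it}.
    """
    status = dict(status)

    def touches_background(r):
        return any(status[n] == BACKGROUND for n in neighbors.get(r, ()))

    # Two background regions can not be contiguous.
    for r in sorted(status):
        if status[r] == SUSPICIOUS and touches_background(r):
            status[r] = LIMPIO

    changed = True
    while changed:                      # apply the rules until no change
        changed = False
        for r in sorted(status):
            if status[r] != SUSPICIOUS:
                continue
            linked = [status[o] for o in blinks.get(r, ())]
            ba = linked.count(BACKGROUND)
            so = linked.count(SUSPICIOUS)
            li = linked.count(LIMPIO)
            if ba:
                if not touches_background(r):
                    status[r] = BACKGROUND
                    changed = True
            elif li and li >= so:       # covers "() SO LI" and "() () LI"
                status[r] = LIMPIO
                changed = True

    # 'Stubborn' regions become background unless they touch background.
    for r in sorted(status):
        if status[r] == SUSPICIOUS and not touches_background(r):
            status[r] = BACKGROUND
    return status
```

Fed with the b-links quoted in the SPREAD walk-through and with a single contiguity (:18 next to :48; the rest of the neighbor relation is not given in the text and is assumed empty here), the sketch ends with :33 :47 :35 :34 :2 :48 :46 as background.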
Other examples of background finding.


Scene CORN


SEARCHING FOR BACKGROUNDS OF CORN
(SUSPICIOUS ARE NIL)
THE BACKGROUND OF CORN IS
(:22)


Scene BRIDGE 


(#30 15 sibracry 
(SuoPiciCus are iby 


IRE pakCK GX JUN UF Bb4ibsd [% 
Scene MOMO. One mistake (:31) is produced here.


SEARCHING FOR BACKGROUNDS OF MOMO
(SUSPICIOUS ARE (:31))
THE BACKGROUND OF MOMO IS
(:6 :31 :40)


FIGURE 'MOMO'
The problem is ambiguous. Like in the case of body isolation (section
'The Concept of a Body'), the problem of determining the regions that
belong to the background of a scene (regions that belong to no body)
is ambiguous; many solutions are possible, as long as no two
background regions are contiguous.
Among the multitude of solutions there exists a preferred one,
which is "the" standard (common, familiar) interpretation chosen
by people.
Our program tries to choose also, among the many solutions,
the standard one.


Summary. A lenient algorithm finds regions (by analyzing the types of
their vertices, and their neighborhood relations) that may possibly
be background, and labels them "SUSPICIOUS". With the idea of
re-classifying the suspicious regions as 'LIMPIO' (clean, no
background) or 'BACKGROUND', a system of b-links is introduced. These
b-links provide more global information about the scene.

Members of the suspicious set are assigned to one of the other
two sets (limpio, background) while the algorithm tries to minimize the b-links
between Background and Limpio regions.

Conclusion. Fair results are obtained with the algorithm just
described. Sometimes, regions are obtained as Background that
are genuine components of a body ("Limpio") and vice versa.

Refinements are needed, but since in our present vision
experiments the background is a homogeneous black area (see first few
pictures of this thesis), no emphasis is placed on them right now.
STEREO PERCEPTION


Summary. So far we have discussed the identification of objects in a
scene and ignored the problem of locating them in a three-dimensional
space.

There are several ways to achieve this. We will discuss here one
of them: the use of more than one view of the same scene.

A natural first step is to establish the correspondence between
points in the two views; that is, given a point in one scene (left),
to find the corresponding point in the other scene (right). Theorems
S-1 below and S-2 on page 234 express criteria
for this "stereo matching".


THEOREM S-1

If both cameras are identical, their optical
axes parallel and the films or sensitive
surfaces or retinas lie in the same plane,
a simple necessary condition for two
image points, one in each retina, to
have come from the same 3-dim point,
is that both image points (left and
right) have the same y-coordinate,
measured in the direction perpendicular
to the line joining the optical centers.


SEE can independently decompose the left
and right scene into the bodies forming them, leaving
as a problem to determine which of the objects in the right scene
corresponds to an object
in the left scene. This can be done because each object will appear
in both views with the same maximum height and minimum height (highest
and lowest values of the y-coordinate of points belonging to that
object); comparisons are easily made by replacing the objects by
"intervals" consisting of these two numbers.


Further disambiguation can be achieved by the use of the function
(WHERE X_L Y_L X_R Y_R), which determines the (x, y, z) 3-dim position
of a point of which its two 2-dim locations (X_L, Y_L) and (X_R, Y_R)
are known {Griffith, AI Memo 143}.
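
For the parallel-axes arrangement of Theorem S-1, a function of this kind can be illustrated with the textbook pinhole-camera triangulation. The sketch below is not Griffith's WHERE: the name `where_parallel`, the parameters `focal` and `baseline`, and the convention that image coordinates are measured from each camera's principal point (with the result expressed in the left camera's frame) are all assumptions.

```python
def where_parallel(xl, yl, xr, yr, focal, baseline, y_tol=1e-6):
    """Position (x, y, z) of the 3-dim point seen at (xl, yl) and (xr, yr).

    Assumes two identical pinhole cameras with parallel optical axes,
    separated by `baseline` along the x axis, with focal length `focal`.
    """
    if abs(yl - yr) > y_tol:
        # Theorem S-1: a genuine pair must share its y-coordinate.
        raise ValueError("not a genuine pair: y-coordinates differ")
    disparity = xl - xr
    if disparity <= 0:
        raise ValueError("non-positive disparity: point not in front of the cameras")
    z = focal * baseline / disparity        # depth from similar triangles
    return (xl * z / focal, yl * z / focal, z)
```

For instance, with `focal=1` and `baseline=2`, the images (0.25, 0.125) and (-0.25, 0.125) place the point at (1.0, 0.5, 4.0).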
Figure 'POINTS'


Given two images of the same scene, before
we can proceed to situate it in 3-dim space,
it is necessary to know which points of the
left scene correspond to points of the right
scene: we have to discover the genuine pairs
in it, a small subset of the cartesian
product (a, b, c, d) x (e, f, g, h). It is
desirable to have an algorithm that avoids an
exhaustive search on this product.


Genuine Pair (definition). A pair of points (P_L, P_R) produced by a
real 3-dim point of the scene in consideration.

Theorem S-2 below gives conditions that a genuine pair must meet. 
A particularization will produce theorem S-1 above. 


THEOREM S-2  The left image P_L and the right image P_R of a point P
             have associated with them a variable, computable from
             (X_L, Y_L) or from (X_R, Y_R), that will acquire the same
             value on P_L and on P_R. It is invariant under change
             of scene.

             For the case where the optical axes are parallel,
             this variable is simply the y-coordinate (Y_L = Y_R) or
             height of the image.

             For the case where the optical axes meet, this
             variable is γ, an angle that plane P_L-C_L-P-C_R-P_R makes
             with Γ, the plane containing the optical axes.

             Any monotonic function of γ will be just as good.
             (cf. figure 'GENUINE PAIRS').


From the theorem, the algorithm (referred to in fig. 'POINTS') that
we may use to establish correspondence between points in the two
views is:

     Compare only points with the same γ
     (or the same y-coordinate).

     Points with different γ can not
     come from a genuine pair.


For each body, the knowledge of the 3-dim location of a few of its 
vertices will be sufficient to position that body in real space, 
achieving in this way the goal of this section. 

See Digression 1 in section 'The concept of a body', for a
different approach.


Figure 'γ-PARAMETRIZATION'


From geometrical considerations and the coordinates of a
point P_L in L, it is possible to attach to the line A-P_L
an angle γ. Similarly, an angle is obtained for lines of R.
It can now be said that a genuine pair (P_L, P_R) must
have the same γ's for P_L and P_R.

γ is a physical quantity, namely the angle that
the plane passing by the image P_L and the optical
centers C_L and C_R makes with the "horizontal" plane Γ
(Γ contains the optical axes). Clearly, for P_L and
P_R to be produced by a point P in 3-dim space, the γ
of P_L must be equal to the γ of P_R. This is a necessary
condition that is easy to check.


A real point P of the scene produces a left image P_L (which has
a certain value of γ) and a right image P_R with the same value of γ
(figure 'γ-PARAMETRIZATION').


     Thus, given a point in one scene, we
     have to search for its genuine pairs
     in the other scene among the points
     with its same γ. They will be found
     along a straight line through A or B.


Parametrization of the scene is possible not only by using γ;
a monotonic function of γ will do.
For computational efficiency, it may be advisable to store the
points of the scenes into arrays according to the value of their γ's.
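
The storage scheme suggested here can be sketched as a table of buckets keyed by the invariant. The sketch below handles the parallel-axes case, where the invariant is the y-coordinate; the function name, the tolerance `tol` (needed because measured heights never agree exactly) and the sample coordinates used with it are assumptions.

```python
import math
from collections import defaultdict

def candidate_pairs(left_points, right_points, tol=0.5):
    """Pairs (left name, right name) whose invariants agree within tol.

    left_points, right_points: {name: (x, y)}. Right points are stored
    in buckets of width tol, so each left point is compared only with
    the right points in its own bucket and the two adjacent ones,
    instead of with the whole cartesian product.
    """
    buckets = defaultdict(list)
    for name, (_, y) in right_points.items():
        buckets[math.floor(y / tol)].append(name)

    pairs = []
    for lname, (_, y) in left_points.items():
        key = math.floor(y / tol)
        for k in (key - 1, key, key + 1):
            for rname in buckets.get(k, ()):
                if abs(right_points[rname][1] - y) <= tol:
                    pairs.append((lname, rname))
    return sorted(pairs)
```

A right point may still be paired with several left points of the same height; the buckets only remove the pairs that Theorem S-2 rules out.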
The function LINE maps points of L into lines of R.
An image point P_L may have come from different 3-dim points P, P', P'', ...,
all of them situated in the line of sight of P_L. The right images
of P, P', P'', ... all fall in a straight line, which is the intersection
of the shaded plane [called plane P_L-C_L-P-C_R-P_R in fig. 'Genuine Pairs']
and the right retina.
When the optical axes are parallel


In this case, points A and B on
line C_L-C_R (fig. 'Genuine Pairs') travel to infinity, and lines P_L-A
and P_R-B become horizontal (parallel to C_L-C_R). The situation looks
like this:


     A genuine pair (P_L, P_R) will
     have the same y-coordinate for
     both of its elements (10.0 in
     this case).


So that, given a left image point P_L, we have to search only
among the points of R with its same height, to find "the" P_R that
will make a genuine pair (P_L, P_R).

But several genuine pairs may be found, because on each
horizontal line on R, many points may lie.


USE OF SEE IN STEREO PERCEPTION


We can use the invariance of the variable described in Theorem
S-2 to locate objects in three dimensional space, from a pair of
stereo views (we will suppose parallel axes; the other case is similarly
treated) as follows:


(1) Make an analysis of the left scene with SEE, identifying the
    bodies.

(2) Id. for right scene.

(3) Reduce each body to an interval formed by two numbers, its
    maximum and minimum height, specifying "closed" if the absolute
    extremal of the body is known, "open" if not.

    In this way we reduce each scene to a set of intervals (see
    figure 'INTERVALS').

    Figure 'INTERVALS'. Each body is reduced
    to an interval (marked closed or open).

(4) Use these intervals to select which left body will go with what
    right body. The answer is simple (because it is unique) even
    in moderately crowded scenes.

    It is simple to take into account the fact that an open
    end of an interval indicates that the interval can extend
    further at such end.


Sources of difficulties are:


(a) Two bodies have the same interval, meaning they have identical
    maximum heights and minimum heights. This is possible.

    Quite easy: reduce some faces to intervals and compare them.


(b) A body is seen in left scene but not in right scene (figures
    L12, R12).


(c) SEE partitions one body in two in one scene, but not in the
    other.

    The "open" and "close" indications will help here.


Also, remember that we are using, when comparing these intervals,
just a very small part of the total information concerning each body.

When the selection is narrowed down to two or three candidates
["left-body 1 is either right-body 2 or right-body 5"], one can use

     (1) the WHERE function of Griffith (op cit),

     (2) as in (a) above, the intervals for each face of the
         objects, so as to choose as "genuine pair" those two
         objects with more agreement in the intervals of their
         faces;

     (3) perhaps a face of unusual shape is enough for
         discrimination, if it appears both in left and right scenes,
         or the number of vertices below the center of gravity,
         or ...


Summary

     In summary, I should like to point out that, while much
     has been stated within the somewhat constricting framework
     of this article, much remains to be stated. Certain, but
     not all, important classes of presentations have been
     treated, and there remain horizons as yet unexplored.
     Conceivably, the author will attempt, ex nihilo nihil fit, to
     establish a more general perspective in the course of a
     subsequent article. (D. Jones, Datamation, Nov 68).


     Also, the reader is referred to other
     articles on the same topic.
FIGURE 'R12'


For the pair L12 - R12, caution
should be exercised, because a
hexagonal prism disappears from
L12 and a brick appears in R12.
Scene L10 - R10. SEE analyzes independently (pages 125 and 128) the left
and right scenes, obtaining the following bodies:


LEFT SCENE (L10)     (BODY 1. IS :5 :1 :4 :12)
                     (BODY 2. IS :6 :15 :7 :11 :14)
                     (BODY 3. IS :8 :9 :10 :3)
                     (BODY 4. IS :2 :13)

RIGHT SCENE (R10)    (BODY 1. IS :3 :5 :6 :14)
                     (BODY 2. IS :13 :1 :11 :9 :15)
                     (BODY 3. IS :8 :2 :10)
                     (BODY 4. IS :4 :7 :12)


For each of the eight bodies, we compute its minimum height and its
maximum height, obtaining the following intervals:


          L10                                   R10

:5 :1 :4 :12        --> [66,105)     [67,154] <-- :3 :5 :6 :14
:6 :15 :7 :11 :14   --> [79,120]     [78,119] <-- :13 :1 :11 :9 :15
:8 :9 :10 :3        --> [68,152]     [65,103) <-- :8 :2 :10
:2 :13              --> [21,82)      [22,82)  <-- :4 :7 :12


These intervals are compared (left with right), trying to find
pairs with discrepancies between their values tolerably small [if the
interval has an open end, differences can be larger]. For 'L10 - R10',
these are

     [66,105) = [65,103)
     [79,120] = [78,119]
     [68,152] = [67,154]
     [21,82)  = [22,82)

that corresponds to the following identification of bodies:

     :5 :1 :4 :12         corresponds to    :8 :2 :10
     :6 :15 :7 :11 :14    corresponds to    :13 :1 :11 :9 :15
     :8 :9 :10 :3         corresponds to    :3 :5 :6 :14
     :2 :13               corresponds to    :4 :7 :12


Once these correspondences between objects in the two images are
found, the function (WHERE ...) {Griffith} will position these bodies
in three-dimensional space, achieving our goal.
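
The comparison of intervals can be sketched in a few lines. This is an illustration rather than the code used for the experiment: each interval is written as (minimum height, maximum height, upper end closed?), the lower end is taken as closed (as it is in every interval of this example), and the tolerances `tol` and `open_tol` are assumed values.

```python
def match_intervals(left, right, tol=3, open_tol=10):
    """For each left body, list the right bodies with a compatible interval.

    left, right: {body name: (low, high, high_is_closed)}.
    Ends that are closed on both sides must agree within tol; when an
    upper end is open on either side, the larger open_tol is allowed.
    """
    def compatible(a, b):
        low_ok = abs(a[0] - b[0]) <= tol
        high_tol = tol if (a[2] and b[2]) else open_tol
        return low_ok and abs(a[1] - b[1]) <= high_tol

    return {lname: [rname for rname, b in right.items() if compatible(a, b)]
            for lname, a in left.items()}

# Intervals of scene L10 - R10, bodies numbered as in the listing.
L10 = {"L1": (66, 105, False), "L2": (79, 120, True),
       "L3": (68, 152, True), "L4": (21, 82, False)}
R10 = {"R1": (67, 154, True), "R2": (78, 119, True),
       "R3": (65, 103, False), "R4": (22, 82, False)}
```

With these tolerances every left body of L10 has exactly one compatible right body, which is the identification given in the text. A left body with an empty list, or with several candidates, would signal one of the difficulties (a) to (c).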
FIGURE. Body :13-:11-:1-:9-:15.

Once this correspondence is known, geometric considerations alone {see Griffith, AI Memo 143} permit us to find their three-dimensional positions.
LOOKING BEHIND


When I started to work on these problems, the idea was to
describe an object by using a model, and with this model in memory,
to search the scene looking for sub-parts of it that would fit the
description.

This work ended (as far as this thesis is concerned) with a
program that finds bodies without having a model of them.

But that is good.

We did not know at the beginning that this could be done.


LOOKING AHEAD 


a. Suggestions for further work
b. Comments
c. Recommendations
d. Summary
e. Conclusions
f. Evaluation
g. Extensions and Implications

     All these matters are
     normally encountered
     grouped in a chapter
     at the end of the work.


I can only partially lump all these important matters in one
final section; many times I cite them in context, that is, next to
the figure or subject that evokes them, or with which they are most
closely related. As a result, they are spread through the body of
this dissertation.


Also,


(1) The box [SUGGESTION] appears through this thesis near a
    partially unsolved or partially formulated problem, and/or its
    partially outlined or partially new solution.

(2) On page 256 there is a list of such suggestion boxes.

(3) The remaining portion of this section and, in general, the
    sections close to the end of this work, abound in statements
    of type (a.) through (g.).

(4) I have tried to start each section with a brief, and end it with
    a summary or conclusion.

(5) The section 'Introduction' (page 10) specifies the problems
    treated in this thesis, and the section 'Preliminary view of
    Scene Analysis' (page 14) produces a general view of available
    methods.
General notation


[SUGGESTION] To put, remove, etc., links, we
may develop a notation that will look like

     (WHEN A (Y) (B :1 C :3 D :2)
           D (K) (A :3 E :4 F :2)
      THEN
           PUT LINK KIND3 :3 :4
           NO LINK :1 :2 )


"When A is a vertex of type 'Y', and
      D is a vertex of type 'K', and
      A and D are joined as specified,
 then
      put a link of kind 3 between region :3 and :4, and
      do not put a link between :2 and :1."


The general notation is
     (WHEN P E E')
"when predicate P is satisfied, evaluate expression E (execute
E), otherwise execute E' (which may be missing)".


In this notation, the predicate P corresponds to a geometric 
pattern or configuration, and the expressions E and Et to the esta- 
blishment or removal of links. 

In SEE, this part is handled by LISP functions (hand-coded), 
one for each particular heuristic. The suggestion is to develop this 
general notation, and an interpreter for it. This will speed up 
programming and checking, but will slow down the execution to 
some extent. 
The main use of the new notation or language is for trying
new heuristics. Actually, it is not difficult to hand-code a
new heuristic in LISP (see function EVERTICES in listings), because
everything reduces to calls to NOSABO, THROUGHTES, GEV, SUME, etc.
I was thinking that a simple LISP macro could transform the
notation (WHEN P E E') into LISP function calls.
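
As an illustration only, the idea of an interpreter for (WHEN P E E') can be sketched in a few lines. The sketch is in Python rather than LISP, and every name in it (Scene, Links, put_link, no_link, when) is an assumption made for the example, not part of SEE. A scene is reduced to a table of vertex types, and the geometric test "A and D are joined as specified" is left out.

```python
from typing import Callable, Dict, FrozenSet, List, Optional

Scene = Dict[str, str]              # hypothetical: vertex name -> vertex type
Links = Dict[FrozenSet[str], int]   # hypothetical: pair of regions -> kind of link
Action = Callable[[Links], None]


def put_link(kind: int, r1: str, r2: str) -> Action:
    """PUT LINK: establish a link of the given kind between two regions."""
    def action(links: Links) -> None:
        links[frozenset((r1, r2))] = kind
    return action


def no_link(r1: str, r2: str) -> Action:
    """NO LINK: remove the link between two regions, if there is one."""
    def action(links: Links) -> None:
        links.pop(frozenset((r1, r2)), None)
    return action


def when(p: Callable[[Scene], bool], e: List[Action],
         e_prime: Optional[List[Action]] = None):
    """(WHEN P E E'): execute E if P is satisfied, otherwise E' (may be missing)."""
    def rule(scene: Scene, links: Links) -> Links:
        for act in (e if p(scene) else (e_prime or [])):
            act(links)
        return links
    return rule


# The example rule, with the test "A and D are joined as specified" left out.
y_k_rule = when(
    lambda scene: scene.get("A") == "Y" and scene.get("D") == "K",
    [put_link(3, ":3", ":4"), no_link(":2", ":1")],
)

links: Links = {frozenset((":1", ":2")): 1}
y_k_rule({"A": "Y", "D": "K"}, links)
print(links)   # the :1-:2 link is gone; a link of kind 3 joins :3 and :4
```

Each heuristic becomes a piece of data (a predicate plus two lists of actions) instead of a hand-coded function, which is what makes new heuristics quick to try; the extra indirection is also where the expected loss of speed comes from.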


Since what the notation or language is really doing is expressing
a two-dimensional configuration as a linear string, a more ambitious
project would be to use the light pen to draw this configuration,
and then have our interpreter or compiler produce the LISP program.
This may look a little like AMBIT-G {Christensen}.
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Assigning a name to an object 


Problem. SEE has separated a scene into bodies. What are they? 
Is there a pyramid among them? Where are the parallelepipeds? 


To answer this, information can be supplied to the program, in 
the form of a symbolic description or model of the object we are 
trying to find. A model is an idealized account of a class of objects, 
all receiving the same name, like "triangular pyramid" or "house". 
Models may have parameters that acquire values after a given instance 
of the model has been found in a scene. Examples are “height” or 
"length of bottom side". 

Some programs that follow the above procedure to name objects 
in a scene are described and discussed in a Master's Thesis {Guzman}. 
There are difficult problems to be solved if we are to make the
system able to recognize occluded objects in many situations.
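
The notion of a model with parameters can be made concrete with a small sketch. It is written in Python for illustration and is not the procedure of the programs cited above. The description of a body as a table of counts and measurements, and the names Model, matches and instantiate, are assumptions of the example. A body that is partially occluded would not show all its faces, so a test this simple would fail exactly in the difficult cases just mentioned.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Body = Dict[str, float]   # hypothetical description of a body already isolated


@dataclass
class Model:
    """An idealized account of a class of objects that share one name."""
    name: str
    matches: Callable[[Body], bool]                  # does the body fit the model?
    parameters: Dict[str, Callable[[Body], float]]   # values read off an instance

    def instantiate(self, body: Body) -> Optional[Dict[str, object]]:
        """Return the name and parameter values, or None if the body does not fit."""
        if not self.matches(body):
            return None
        values = {p: measure(body) for p, measure in self.parameters.items()}
        return {"name": self.name, **values}


# Toy model: a fully visible triangular pyramid has 4 faces and 4 vertices.
triangular_pyramid = Model(
    name="triangular pyramid",
    matches=lambda b: b["faces"] == 4 and b["vertices"] == 4,
    parameters={"height": lambda b: b["height"]},
)

body = {"faces": 4, "vertices": 4, "height": 2.5}
print(triangular_pyramid.instantiate(body))
```

The parameters acquire their values only after an instance has been found, as in the text: the model itself says nothing about how high a particular pyramid is.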


One could, of course, bypass SEE and look for particular objects,
as is done by Polybrick {Hawaii 69}, a program that finds
parallelepipeds.
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Do not use over-specialized assumptions. Use more information In 
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trying to solve a problem, people will apply quite different methods. 
They may also make quite different assumptions, some of which
may not hold. Due to particular experience, environment, preferences,
etc., some subjects may be using over-specialized assumptions
instead of requesting more data, more information, to solve the
problem. We may bias our views and risk arriving at conclusions
(of the "common sense" type) which are valid only for restricted
segments of the population, or in particular conditions or situations.


Holes. For instance, if most of the readers of this thesis [technical
specialists, who have learned to read, are interested in graphical
processing and computers, etc.; who may not be considered a
representative cross-section of Homo sapiens] perceive "objects" a, b
and c of figure 'HOLES' as holes {Winston}, we may be tempted to
conclude that this is a general property, and rush to write a
subroutine to find such orifices.

     Fig. 'HOLES'
     The idea that objects a, b, c
     have to be interpreted by all
     men, and hence by a program, as
     holes in the larger box, is
     dangerous. {cf. AI Memo 163}

Perhaps other sectors of our
population would simply say, with respect to a, b, c of figure
'HOLES', that "there is not enough information to make a decision"
(see also section 'On optical illusions'). Or they may come with
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different answers, using their set of assumptions which may be 
different from ours, since their experience is different too. 
The Ames' Room (see Box, page 20!) and Gregory (see Box) warn us 
of this. 


Another example of over-specialization. For people familiar with


Descriptive Geometry, it is easy to see that figure 'DESCRIPTIVE' (I)
shows a straight line in the first octant. For them, indeed, it
is easy to visualize this line in three dimensions and have a fairly
good idea of its position and orientation in space, just from
figure (I).

Other persons would need a more conventional figure, such as
figure 'DESCRIPTIVE' (II), to visualize the same line, to get the
same idea.

What happened was that the first group of persons were using
specialized knowledge, their minds were biased, figure (I) was
familiar to them, etc.


(II)
Figure 'DESCRIPTIVE'


Conclusion. Before looking for heuristics and shortcuts, before making
assumptions, deductions, etc., let us be sure that there is enough
data to solve our problem.
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Human perception versus computer perception Given a teosdinenaional 


line-drawing of a three-dimensional scene, the problem of finding
bodies in it is inherently ambiguous: many 3-dim scenes can generate
the same 2-dim scene.

Multiple solutions are possible. Moreover, the metatheorem
of page 9 guarantees that a solution always exists, and provides
ways to construct it. We call this solution "trivial"; in effect it
is trivial to write a computer program that will invariably find it.

From the multitude of possible solutions, human beings select
one, which is* different from the trivial, and call it the "normal"
or "common" or "standard" or "reasonable" interpretation of the
scene.

Our program SEE also selects one of the many solutions. 

How does its selection compare with the human choice? 


== When the scene is "clear", in the sense of evoking human
   unanimity, SEE will* also select that same answer. Example:
   Figure 'TOWER'.

== As the scene or drawing gets complicated or ambiguous, mortal
   behavior deteriorates; opinions split, optical illusions may emerge
   (indicating that contradictory evidence is perceived), and several
   plausible answers are emitted.
   The answer of SEE in these cases will* be found among the
   humanly plausible selections. In some cases, it may not agree
   with the majority.

== Finally, people make mistakes. They will see an object that is
   not there, or will fail to see an object, or classify it as
   "impossible".
   But SEE also errs. It sometimes succeeds where people fail;
   more often it is the other way around.

* In an overwhelming majority of cases.
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TABLE "ASSUMPTIONS" 


ASSUMPTIONS MADE BY THE PROGRAM 


These assumptions have to be obeyed for SEE to give good results:

   The objects are three-dimensional solids formed by planes (1).
   No needles or cardboards allowed.

   They produce a two-dimensional image or projection where all
   lines are straight (2).

   Faces have no drawings, marks, labels, etc., imprinted on them.

   Objects do not have holes in them.

(1) See section 'On optical illusions' for conditions for partial
    lifting of this assumption.

(2) See section 'On curved objects' for conditions for partial lifting
    of this assumption.


ASSUMPTIONS NOT MADE BY THE PROGRAM 


These assumptions are not necessary for the correct functioning of SEE; 
it will work well with or without them. 


Only prisms are allowed. 

The scene is a parallel projection, or isometric drawing. 

The objects are convex. 

The model or description of the object has to be known to SEE. 
The objects have to appear unoccluded or unobstructed in the view. 


The objects have “weight” in the vertical direction and will 
fall if not supported. 


The background is known in advance (see 'On background
discrimination by computer').


I repeat, these assumptions are NOT obeyed by our program. 
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II, Project MAC, MIT September 1967. 
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NOTES: Reports with AD numbers (such as AD-652-017, page 286) can 
also be obtained as follows: 
Government contractors may obtain copies of this report from
the Defense Documentation Center, Document Service Center, 
Cameron Station, Alexandria, Virginia 22314. Orders will be 
expedited from DDC if placed by your librarian, or some other 
person authorized to request documents. 


Other U.S. citizens and organizations may obtain copies of this 
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Machinery. 


Proc. FJCC = Proceedings of the Fall Joint Computer Conference 


Proc. SJCC = Id. Spring. (Spartan Books, or Thompson Book Co.,
Washington, D. C.)
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ANNOTATED LISTING OF THE FUNCTIONS USED 


You do not have to know these things in order to use SEE (reading
'How to use the program' on page 7€ is enough) or to understand
what it does (it is explained in 'SEE, a program that finds bodies in
a scene', page $@); these things are put here merely for completeness
and to make the inner workings of SEE easier to understand.

A listing is a formal description There is a stronger reason, 
however. A listing of the programs is a formal description, an 
algorithm, an exact statement in a formal language of what we may 
have been describing, perhaps inaccurately, in a natural language 
(English). It becomes the starting point of serious discussions. 
The reader who is skeptical at some point, or did not understand 
some English statement, can always clarify his doubts in the listing. 
To be understandable, the listing has to have annotations, comments. 


A mathematician is not forced to explain his work always in
natural language, but rather he is allowed to employ abstract notations,
symbolisms, formalizations of his thoughts (indeed, it is preferable
this way). A programmer should not hide his listings (he should not
be forced to restate his algorithms in natural language exclusively
{ 68}) and force his readers to use the ambiguous channels
of his natural language communication.


And this brings another point. Not only should a programmer not
hide the listing (unless there are bugs or incomplete subroutines);
I mean that honest and reasonable efforts should be made to give
future potential users access to these programs. Include:

== Documentation

== Listings, tape or card deck names, etc.

== Test data

== Printout of an interaction with such test data,
   including loading, compilation, execution, results.

== Time spent (by machine and by man).

See also R. Kain's letter {C. ACM March 67}.
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k 
(FUNCTION 
(LAMGDA (5) (MEMQ (CAR WV) (CAR J)5)dd) 
(NCUNC (CAR K) (CAR 18)) Merge beth, by sncreasing (oar te) 
(2PLaACu K 
(COMPACTIFY (APPEND (COR K} (COR $8)))) 


(APLACA 48 (0) jeu te. Ths al scte R, becam, 
(RPLACD 98 U1) pe ala ete & x 


} tl owe longs t R. 
yn? 
(GE12 (CAR ¥) SLNKDDD)) 
(4GeETe (CAAR K) SLNK} DDD) 
SMB}? 
RESULTS we at almcst done , $5 contains nucles which were cncremented 4 foneby regions. 
(PRINT (QUOTE RESULTS)) 
(S—T@9 ¥v QO) Tebedy forms a *bedy’ from cack nuclus 
(COND (43 (PUTPROP § 
{NCONC {MAPCAR (FUNCTIUN (LAMOUA (a) (FOBODY (CAR WIND) 5B) (GETO S BODIES)) 
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